

Fast int<-->float conversion routines?


24 replies to this topic

#21 }:+()___ (Smile)

    Member

  • Members
  • 169 posts

Posted 05 April 2012 - 07:13 PM

The first line computes 31 - exponent.
The second computes 2^31 * mantissa (0x80000000 restores the implicit leading 1 that is omitted from the stored mantissa).
-(e < 32) is -1 (or 0xFFFFFFFF) if exponent >= 0 and 0 if exponent < 0.
So the third line computes (2^31 * mantissa) >> (31 - exponent) = mantissa * 2^exponent if exponent >= 0 and zero otherwise.
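
The float routine being described is not quoted in this part of the thread. A minimal sketch of what those three lines might look like, reconstructed from the description above and by analogy with the double version, could be the following. The name float_to_uint32 is made up, memcpy is used to read the bits, and the shift count is masked so that e >= 32 stays well defined. It assumes x >= 0 and 32-bit IEEE 754 floats.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical reconstruction, not GloW's original code.
// Assumes x >= 0 and IEEE 754 single precision.
uint32_t float_to_uint32(float x)
{
	uint32_t y;
	std::memcpy(&y, &x, sizeof y);                // raw bits of x
	uint32_t e = 0x7F + 31 - (y >> 23);           // 31 - exponent
	uint32_t m = 0x80000000u | y << 8;            // 2^31 * mantissa, implicit 1 restored
	return (m >> (e & 31)) & -(uint32_t)(e < 32); // truncates; zero if exponent < 0
}
```

For example, float_to_uint32(3.7f) gives 3 and float_to_uint32(0.5f) gives 0, since the conversion truncates.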

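For double it will be something like:
uint64_t double_to_uint64(double x)
{
	uint64_t y = *(uint64_t *)&x;                 // raw bits of x (assumes x >= 0)
	uint64_t e = 0x3FF + 63 - (y >> 52);          // 63 - exponent
	uint64_t m = (uint64_t)1 << 63 | y << 11;     // 2^63 * mantissa; the 1 must be 64-bit before shifting
	return (m >> (e & 63)) & -(uint64_t)(e < 64); // zero if exponent < 0; e & 63 keeps the shift count in range
}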

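Personally I prefer more black magic:
int32_t fast_round(float x)
{
	x += 12582912;                                     // 1.5 * 2^23: the rounded value now sits in the low mantissa bits
	return (*(int32_t *)&x ^ 0x4B000000) - 0x00400000; // XOR away the exponent bits, then remove the 0x400000 bias
}
but my tests show that a simple (int)x is fastest (at least under g++).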
Sorry my broken english!

#22 .oisyn

    DevMaster Staff

  • Moderators
  • 1842 posts

Posted 06 April 2012 - 10:22 AM

Note that 12582912 is just 0xc00000. But why 0xc00000? I always did something like:

int fast_round(float x)
{
        x += (1 << 23);                     // forces the exponent to 23 for 0 <= x < (1 << 23)
        return (int&)x & ((1 << 23) - 1);   // keep only the 23 mantissa bits
}

I don't get the XOR either. Your method breaks down at 0x400001, mine at 0x800000. The code by GloW works for the full (positive) int range. The method above is easily explained: floats always store their value as 1.mantissa * 2^exponent, where the mantissa is 23 bits wide. Adding 1 << 23 to any number smaller than 1 << 23 forces the exponent to 23, so your number is encoded directly in the mantissa bits; no shift is needed, and you can simply AND the exponent part out of the float.
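
To make the two ranges concrete, here is a small sketch of both variants side by side. The function names are made up for illustration, memcpy replaces the reference cast, and it assumes IEEE 754 floats, the default round-to-nearest-even mode, and float arithmetic that is really done in single precision (e.g. SSE rather than x87 extended precision).

```cpp
#include <cstdint>
#include <cstring>

// Reads the bit pattern of a float (helper name is hypothetical).
static int32_t float_bits(float x)
{
	int32_t b;
	std::memcpy(&b, &x, sizeof b);
	return b;
}

// Unsigned variant: valid for 0 <= x < 0x800000.
int32_t round_add_2_23(float x)
{
	x += 8388608.0f;                      // 1 << 23
	return float_bits(x) & 0x007FFFFF;    // keep the 23 mantissa bits
}

// Signed variant: valid for -0x400000 <= x <= 0x400000.
int32_t round_add_1_5_2_23(float x)
{
	x += 12582912.0f;                     // 1.5 * (1 << 23)
	return (float_bits(x) ^ 0x4B000000) - 0x00400000;
}
```

With these, round_add_2_23(8388608.0f) returns 0 and round_add_1_5_2_23(4194305.0f) returns 4194304, which are the breakdown points mentioned above. Ties round to even, so 2.5f gives 2.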


Anyway, the reason that (int)x can be slow in many implementations is that it may call an implementation function that ensures the FPU rounding mode is set to truncate before the FIST(P) instruction is executed. FIST(P) isn't that slow nowadays, certainly not compared to writing a float to memory and then loading it into an int register to process some stuff. Doing bitmagic is mostly ideal when the float is already in memory, or when you're able to manipulate the bits directly in float registers such as in SSE. But SSE has a fast convert-to-int-with-rounding-mode instruction anyway, so that's a moot point. SSE3 even introduces the FPU instruction FISTTP, which always truncates regardless of the FPU rounding mode.
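
As a rough illustration of the truncate versus round-to-current-mode distinction in portable terms: a plain cast always truncates, while std::lrint from <cmath> honours the current rounding mode (round-to-nearest-even by default). On SSE targets compilers typically lower these to cvttss2si and cvtss2si respectively, though that depends on the compiler and flags. The wrapper names are made up.

```cpp
#include <cmath>

// Truncates toward zero, whatever the current rounding mode is.
int to_int_truncate(float x)
{
	return (int)x;
}

// Rounds according to the current rounding mode
// (round-to-nearest-even unless it has been changed).
long to_int_current_mode(float x)
{
	return std::lrint(x);
}
```

So to_int_truncate(2.7f) is 2, while to_int_current_mode(2.7f) is 3 under the default mode.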
C++ addict
-
Currently working on: the 3D engine for Tomb Raider.

#23 }:+()___ (Smile)

    Member

  • Members
  • 169 posts

Posted 06 April 2012 - 01:34 PM

My method works for signed numbers, so the accepted range is the same size: [-0x400000; 0x400000] instead of [0; 0x800000].

And I think any decent compiler (with the right flags set) nowadays uses SSEn for floating-point computations, so (int)x will be fastest.

#24 JarkkoL

    Senior Member

  • Members
  • 475 posts

Posted 07 April 2012 - 04:07 PM

.oisyn, on 25 June 2007 - 11:01 PM, said:

What kind of security are you talking about exactly?
IIRC, the only safe way to do such a cast according to the C++ standard is memcpy() (or a cast to char and a byte-by-byte copy). You could implement raw_cast<>() using memcpy if you want to be C++ standard pedantic and have nice-looking code for such casts, e.g. int x=raw_cast<int>(1.0f);
#include <cstring>  // std::memcpy

template<typename T, typename U>
T raw_cast(U v_)
{
  // some static assert here to ensure sizeof(T)==sizeof(U) and that you use only non-class types for T & U
  T v;
  std::memcpy(&v, &v_, sizeof(T));
  return v;
}
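
One possible way to fill in the static assert mentioned in the comment, assuming a C++11 compiler. The name raw_cast_checked is made up to keep it apart from the version above, and treating "non-class" as !std::is_class is just one reading of that comment.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical checked variant of raw_cast.
template<typename T, typename U>
T raw_cast_checked(U v_)
{
  static_assert(sizeof(T) == sizeof(U), "raw_cast_checked: sizes must match");
  static_assert(!std::is_class<T>::value && !std::is_class<U>::value,
                "raw_cast_checked: non-class types only");
  T v;
  std::memcpy(&v, &v_, sizeof(T));  // copy the object representation
  return v;
}
```

On an IEEE 754 platform, raw_cast_checked<uint32_t>(1.0f) gives 0x3F800000, while raw_cast_checked<uint64_t>(1.0f) fails to compile because the sizes differ.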


#25 Vilem Otte

    Valued Member

  • Members
  • 345 posts

Posted 10 April 2012 - 12:25 AM

Casts are evil and should be avoided ... especially when you're using haskell :D ... okay end of joke...

#23 }:+()___ (Smile) - you're right, and all of today's compilers should produce SSE code (if they don't, you're either using a really old version or not using good optimization flags). So you can actually be happy with (int)value. Bit-magic trickery is outdated: it will be outperformed by SSE (or AVX).

The same goes for min/max: one can use black magic there, but SSE will heavily outperform it (as in this case, and probably in all black-magic cases today).
My blog about game development (and not just game development) - http://gameprogramme...y.blogspot.com/

If you don't know how to speed up application, go "roarrrrrr!", hit the compiler with the club and use -O3 :D




