Jump to content


- - - - -

hypot()-like function but for 3 components


8 replies to this topic

#1 kwhatmough

    New Member

  • Members
  • Pip
  • 5 posts

Posted 01 August 2008 - 09:53 PM

I need a fast way to compute sqrt(x*x + y*y + z*z) that avoids intermediate overflow and underflow.

I know that most runtime libraries (C,C++,Java) provide a 2d hypot() function that does this for 2 variables, _AND_ I realize that I can achieve my result using 2 calls to it as follows:

hypot( hypot(x, y), z)

but that seems uneccessarily slow. Is there a faster way that still avoids intermediate overflow and underflow? I am _NOT_ looking for a fast sqrt() function here. I am well aware of the _NUMEROUS_ threads on fast sqrt() and fast reciprocal sqrt(). What I am looking for is a way to scale the 3 inputs before calling <any fast sqrt method here>, and then to un-scale afterwards to obtain the accurate result.

So I started looking at how to implement a 2d hypot() function:

Define hypot(x, y) as sqrt(x*x + y*y) but that avoids intermediate overflow and underflow:
Using the fact that sqrt(x^2+y^2) == k*sqrt((x/k)^2+(y/k)^2) for k>0
you can implement a fast hypot() by choosing k to be a power of 2 close to sqrt(x*y). Since you choose k to be a power of 2, computing k, x/k, and y/k can be done using bit twiddling and knowledge of IEEE floating point representation. I can go into that further, but my problem is how to generalize this for 3 components x,y,z without simply nesting the call as hypot(hypot(x,y),z). Any suggestions?

#2 Reedbeta

    DevMaster Staff

  • Administrators
  • 5307 posts
  • LocationBellevue, WA

Posted 01 August 2008 - 09:58 PM

Since sqrt(x*y) is the geometric mean of x and y, I'm guessing for 3D you want the geometric mean of x, y, and z, which would be the cube root of x*y*z. But I suspect that's not as easy to estimate by bit-twiddling as sqrt(x*y) is...
reedbeta.com - developer blog, OpenGL demos, and other projects

#3 kwhatmough

    New Member

  • Members
  • Pip
  • 5 posts

Posted 01 August 2008 - 10:07 PM

Thanks for the reply; I see that my reference to hypot() causes confusion but actually all I want is the magnitude of a 3d vector that avoids overflow and underflow that one would normally get in the interim calculation of

sqrt(x*x + y*y + z*z).

{Square, not cube root in my case}.

#4 kwhatmough

    New Member

  • Members
  • Pip
  • 5 posts

Posted 01 August 2008 - 10:08 PM

Sorry now I understand. You're saying cubeRoot(xyz) for the moderating factor. I see.

#5 Nils Pipenbrinck

    Senior Member

  • Members
  • PipPipPipPip
  • 597 posts

Posted 02 August 2008 - 09:05 AM

Hi.

Take a look at this paper:

http://torus.untergr...gorean_sums.pdf

It shows how to calculate the vector length in a non obvious but mathematical much more stable way. It's not *that* fast though.

Don't bookmark the pdf please. I'll take it offline in a couple of days..

Cheers,
Nils
My music: http://myspace.com/planetarchh <-- my music

My stuff: torus.untergrund.net <-- some diy electronic stuff and more.

#6 Blaxill

    Member

  • Members
  • PipPip
  • 66 posts

Posted 08 August 2008 - 07:55 PM

Can you not extend your original idea?
sqrt(x^2+y^2+z^2) == k*sqrt((x/k)^2+(y/k)^2+(z/k)^2) for k>0
Assuming you want the geometric mean (as suggested by Reedbeta) you want k to approximate (xyz)^(1/3) as a power of 2 so
2^n =  (xyz)^(1/3)
2^3n = xyz
n = (ln xyz) / 3(ln2)
Urgh maybe its not the way to go, but only one ln needs to be calculated. I don't know of any useful approximations to ln x, but it would only have to be loosely accurate
After that, x/k, y/k and z/k can be done again with bit shifts.

#7 kwhatmough

    New Member

  • Members
  • Pip
  • 5 posts

Posted 09 August 2008 - 12:31 AM

(Thanks to Nils for the fascinating paper).

Yes I believe that is the idea. If you assume IEEE format for the floats you can do some bit-fiddling to approximate a log_2 function by extracting the exponent bits.

Something like this might work for positive x:

floor(log_2(float x)) is something like: (floatToIntBits(x) >> 23) - 127

where in C you might implement floatToIntBits(x) as: *(int*)&x
or (preferably?) use an anonymous union:
union {
int asInt;
float asFloat;
} u;
u.asFloat = x;
return (u.asInt >> 23) - 127;

#8 Blaxill

    Member

  • Members
  • PipPip
  • 66 posts

Posted 09 August 2008 - 01:52 AM

So I think you want this. The 2 float/integer conversions are probably the slowest bits.

float get_k(const float &x, const float &y, const float &z)
{
	union
	{
		float mult;
		int bits;
	};
	mult = x*y*z;
	const int exp = (((*reinterpret_cast<int*>(&mult)) >> 23) & 255) - 128;
	bits = (bits & ~(255 << 23)) + 127 << 23;
	mult = (((-1.0f/3.0f) * mult + 2.0f) * mult - 2.0f/3.0f + static_cast<float>(exp))* 0.69314718f; // mult = ln(xyz)
	mult *= 0.48089834; // mult = (ln xyz) / 3(ln2) = n
	
        // k = 2^n = 2^mult
	return (1<<static_cast<int>(mult));
}


#9 rranft

    New Member

  • Members
  • Pip
  • 7 posts

Posted 09 August 2008 - 06:02 PM

for(long i = 0; i < 500000; i++){
ax = (float)rand() + 1 / 100.0f;
ay = (float)rand() + 1 / 100.0f;
az = (float)rand() + 1 / 100.0f;
bx = (float)rand() + 1 / 100.0f;
by = (float)rand() + 1 / 100.0f;
bz = (float)rand() + 1 / 100.0f;

dx = ax - bx;
dy = ay - by;
dz = az - bz;

dist = sqrt(dx + dy + dz);
}

takes 0.07 seconds on a core2 duo 2.14ghz - of course, that's all I was doing....





1 user(s) are reading this topic

0 members, 1 guests, 0 anonymous users