0
101 Nov 10, 2005 at 13:51

I’m making a software 3d engine just for fun and I’ve reached a point where I want to do some optimizations before I add new features. I noticed an extreme drop in framerates everytime I add a new light source, so i figured my light calculations would be an obvious choice for optimization. One of the more expensive opperations in my light calculations is the two per-vertex division and square root to normalize surface-to-light and midway vector for diffuse and specular light. Now on to my question:

Is there any fast way to approximate the normalized vector? I think I read somewhere that you can approximate the length of a vector somehow so you don’t have to do a square root.

Any usefull information on this would be appreciated :)

9 Replies

0
101 Nov 10, 2005 at 15:12

One of the graphics gems books has a 2D version of a length aproximation. I implemented it in 3D. The isosurface looks like an inflated cube with the sides cut as a +.

double cVector::FastLength() const{
// The original idea for this approximation comes from one of the "Graphic Gems"-books, where I found a 2D-version of the same thing.
// This is basically the manhattan distance, with the 2 smallest factors scaled down.

double a, b, c;

a=std::fabs(x);
b=std::fabs(y);
c=std::fabs(z);

// Assigning the greatest value to a.
if((b>a)&&(b>c)){
double Temp = b;
b=a;
a=Temp;
}
else if((c>a)&&(c>b)){
double Temp = c;
c=a;
a=Temp;
}

return a+(b+c)*0.366; // I found this value optimal. There is probably no point in finetuning it further, since there still is a 20% fault... (For an int version of the same function, use a one step bitshift (a+((b+c)>>1)) instead)
}


When I tested it it was about 20% faster, but I did no optimizations whatsoever.

0
101 Nov 10, 2005 at 15:20

There is a very useful and fast approximation to 1/sqrt(x) around:

float math::fast_invSqrt(float x)
{
float xhalf = 0.5f*x;
int i = *(int*)&x;         // get bits for floating value
i = 0x5f3759df - (i>>1);   // give initial guess y0
x = *(float*)&i;           // convert bits back to float
x *= 1.5f - xhalf*x*x;     // newton step, repeating this step
// increases accuracy
//x *= 1.5f - xhalf*x*x;

return x;
}


The whole thing is described in detail here:

0
101 Nov 10, 2005 at 18:27

I’m afraid neither approximation did wonders for my framerate. Perhaps i could squeeze out 1 more FPS, which isn’t really worth the artifacts it produces. Guess I’ll look for other “hot spots” to optimize.

0
101 Nov 10, 2005 at 21:00

@weevil

isn’t really worth the artifactsl

Must have looked terrible. Care to post a screenie?

0
101 Nov 11, 2005 at 00:13

I woudn’t say it looked terrible. Actually it was hard for me to produce a screenshot that clearly showed the artifacts (but it was easy to spot on a moving surface). I’ve removed diffuse and specular maps in these shots, so what you’re seeing is the actual shading that would have been applied to a texture

Reference shot made with true normalize

Notice how round the edge of the specular highlight is

Same shot made using approzimated normalize

Notice that the specular highlight forms a triangle towards the middle of the surface.

0
101 Nov 11, 2005 at 12:32

Ahh. It’s java… Then the overhead of any approximation should pretty quickly be greater than the gain.

Nice screenies, though. I like the bloom effect.

0
101 Nov 11, 2005 at 14:00

@geon

Ahh. It’s java… Then the overhead of any approximation should pretty quickly be greater than the gain. Nice screenies, though. I like the bloom effect.

Yeah i have a softspot for java :)

If you’re interested, the bloom effect is achieved by saving the overflow of each pixel in a buffer (that is, the part of the color that is not rendered because it is above 255 in intensity). When the scene is rendered i do a post process pass where i apply box blur to the overflow buffer (can be done in O(n) complexity) and blend it unto the rendering surface using additive blending.

0
102 Nov 14, 2005 at 04:45

@weevil

Yeah i have a softspot for java :)

Then you should definitely try C++. Right now! :w00t:

The fastest normalization I know uses SSE assembly. It has two instructions that approximate a square root and a reciprocal and execute in just one clock cycle. Their precision is 12 mantissa bits, which is plenty for the normalization in lighting functions. Assembly instructions are directly accessible from C++.

So, unless you really want a cross-platform software renderer, Java is a dead end street. Most approximations and low-level ‘optimizations’ just slow things down. If you want to stick with Java, forget about perfomance almost completely. With C++ and assembly you can really do some nifty stuff.

0
101 Nov 14, 2005 at 13:19

I can certainly understand your point of view Nick, square root in 1 cycle sounds allmost too good to be true :). I’ve been coding C++ (and even asm) in the past and are well aware of the advantages these languages has.

However the point of my project was never to build a 3d engine that could be used for games (or anything really) but simply to prove to myself I can do it myself, without the use of hardware. So I would rather work in an environment I feel comfortable with (Java) than an environment i hate but gives me better performance (native c++).

Trying to optimize something like square root was probably just force of habbit from my C++ days, and a case of me underestimating the overhead of envoking methods etc. in Java.