Shader Optimization

jstier 101 Aug 03, 2008 at 15:49

Hi,

I am curious whether anyone has input on this shader for modeling tree leaves. Patches of leaves are made up of textured billboards that always face the viewer. The only inputs to the shaders are vertex attributes containing the coordinates of each billboard corner in the x-y plane. Can anyone see possible performance improvements, or does anyone have other comments?

The shaders are integrated with Geist3D. Anyone interested can download the Geist3D IDE and experiment with the shaders via the GLSL source code editor.

The vertex shader positions the vertices of a billboard perpendicular to the eye vector:

attribute vec2 gs_Attribute;

varying vec3 vLight;
varying vec3 vNormal;

void main()
{
    // Columns 0 and 1 of the inverse modelview matrix are the camera's
    // right and up axes expressed in object space.
    vec3 U = gl_ModelViewMatrixInverse[1].xyz; // camera up
    vec3 V = gl_ModelViewMatrixInverse[0].xyz; // camera right

    // Column 3 is the eye position in object space; it serves as the billboard normal.
    vNormal = gl_ModelViewMatrixInverse[3].xyz;
    // The light position is stored in eye space, so bring it back into object space.
    vLight = vec3(gl_ModelViewMatrixInverse * gl_LightSource[1].position);

    vec3 pos = gs_Attribute.x * V + gs_Attribute.y * U + gl_Vertex.xyz;
    gl_Position = gl_ModelViewProjectionMatrix * vec4(pos, 1.0);
    gl_TexCoord[0] = gl_MultiTexCoord0;
}

The fragment shader simply maps a texture containing a few dozen leaves onto the quad and then performs Phong shading. I think this could be improved a lot by perturbing the normals, since not all leaves in a patch face the viewer at the same angle. I am thinking of using a noise texture to perturb the normals by up to ±90 degrees:

uniform sampler2D Texture;

varying vec3 vLight;
varying vec3 vEye;
varying vec3 vNormal;

void main (void)
{
    vec4 color = texture2D (Texture, gl_TexCoord[0].st);

    vec3 light = normalize(vLight);
    vec3 normal = normalize(vNormal);
    float intensity = max(dot(light, normal), 0.0);

    color.rgb *= vec3(intensity);
    gl_FragColor = color;
}

Here is a picture of what a tree looks like:

[Image: Eco_3.jpg]

9 Replies

Kenneth_Gorking 101 Aug 04, 2008 at 09:07

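It is fairly simple code, but you can save one reciprocal square root (rsqrt) in the fragment shader:

// Same as dot(normalize(A), normalize(B)), but with one rsqrt fewer.
// GLSL names the reciprocal square root inversesqrt(); rsqrt() is the HLSL/Cg spelling.
float dot_unorm(vec3 A, vec3 B)
{
    return dot(A,B) * inversesqrt(dot(A,A) * dot(B,B));
}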

uniform sampler2D Texture;

varying vec3 vLight;
varying vec3 vEye;
varying vec3 vNormal;

void main (void)
{
    vec4 color = texture2D (Texture, gl_TexCoord[0].st);

    float intensity = max(dot_unorm(vLight, vNormal), 0.0);

    color.rgb *= vec3(intensity);
    gl_FragColor = color;
}
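
For reference, the identity behind dot_unorm can be written out as follows. This is only the algebra; it assumes both vectors are non-zero and says nothing about how a particular GPU schedules the instructions.

```latex
\[
\frac{A}{\lVert A\rVert}\cdot\frac{B}{\lVert B\rVert}
  = \frac{A\cdot B}{\lVert A\rVert\,\lVert B\rVert}
  = \frac{A\cdot B}{\sqrt{(A\cdot A)\,(B\cdot B)}}
\]
```

Normalizing each vector separately needs one reciprocal square root per vector, while the combined form on the right needs only one in total.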

Instead of using a noise texture to perturb the normals, it would be faster to create a normal map for the leaf texture in an offline process. Using a normal map would only add one extra texture sample and an addition to the shader.
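
A rough sketch of what the normal-mapped fragment shader might look like. The NormalMap sampler is hypothetical, as is the assumption that its normals are stored relative to the billboard (x along the camera's right axis, y along its up axis, z toward the viewer) and packed into the 0..1 range. It also assumes the modelview matrix contains no scaling, so that the axes read from gl_ModelViewMatrixInverse are unit length.

```glsl
uniform sampler2D Texture;
uniform sampler2D NormalMap; // hypothetical: billboard-space normals packed into [0,1]

varying vec3 vLight;
varying vec3 vNormal;

void main(void)
{
    vec2 uv = gl_TexCoord[0].st;
    vec4 color = texture2D(Texture, uv);

    // Billboard basis in object space, the same axes the vertex shader uses.
    vec3 right  = gl_ModelViewMatrixInverse[0].xyz;
    vec3 up     = gl_ModelViewMatrixInverse[1].xyz;
    vec3 facing = normalize(vNormal);

    // Unpack [0,1] to [-1,1], then rotate the sample into object space.
    vec3 n = texture2D(NormalMap, uv).xyz * 2.0 - 1.0;
    vec3 normal = normalize(n.x * right + n.y * up + n.z * facing);

    float intensity = max(dot(normalize(vLight), normal), 0.0);
    color.rgb *= intensity;
    gl_FragColor = color;
}
```

Compared with the original shader this costs one more texture fetch plus the arithmetic to rebuild the normal, so whether it is a net win would still need to be profiled.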

rouncer 104 Aug 04, 2008 at 11:09

that tree looks really nice.

jstier 101 Aug 04, 2008 at 16:46

I have to keep that dot_unorm in mind, because I could use it in a lot of other shaders. I wonder if it is truly faster, though. Doesn’t the normalize operation run in hardware, making it faster than several calls to dot and rsqrt? I am just guessing here.

I will look into the normal map. It will bring some improvement, but the problem is also that all the leaves in the patch texture are parallel to the plane. In a real tree, the leaves are turned in all directions and hence reflect light differently. A normal map alone won’t do the trick. I actually have to create a different patch texture where some of the leaves are turned, but it will be almost impossible to create a normal map for that …

The trees are generated using the algorithm described in “Creation and Rendering of Realistic Trees” by Jason Weber. Just by changing a few parameters you can create vastly different trees. In large scenes the trees are rendered into billboards which are then used for different LODs.

Kenneth_Gorking 101 Aug 04, 2008 at 21:41

@jstier

I have to keep that dot_unorm in mind, because I could use it in a lot of other shaders. I wonder if it is truly faster, though. Doesn’t the normalize operation run in hardware, making it faster than several calls to dot and rsqrt? I am just guessing here.

Not likely, because normalize is actually a macro instruction, which results in three instructions: a dp3, an rsq and a mul. Newer GPUs can probably run some of these instructions in parallel, but to be sure, you should do some profiling.

@jstier

I will look into the normal map. It will bring some improvement, but the problem is also that all the leaves in the patch texture are parallel to the plane. In a real tree, the leaves are turned in all directions and hence reflect light differently. A normal map alone won’t do the trick. I actually have to create a different patch texture where some of the leaves are turned, but it will be almost impossible to create a normal map for that …

When you are rendering the leaves to billboards, couldn’t you just render the normals to a separate texture and apply that during shading?
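
Something along these lines could serve as the fragment shader for that extra pass. It is only a sketch: it assumes the leaves exist as real geometry with per-vertex normals at bake time, and that a matching vertex shader (not shown) writes the varying vNormal.

```glsl
// Bake pass (sketch): writes the surface normal instead of a lit color.
// vNormal is assumed to be written by a matching vertex shader, e.g.
//   vNormal = gl_NormalMatrix * gl_Normal;
varying vec3 vNormal;

void main(void)
{
    vec3 n = normalize(vNormal);

    // Pack [-1,1] into [0,1] so the normal fits an ordinary RGB texture.
    gl_FragColor = vec4(n * 0.5 + 0.5, 1.0);
}
```

The resulting texture can then be unpacked with `* 2.0 - 1.0` when shading the billboard.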

Reedbeta 167 Aug 04, 2008 at 22:26

@Kenneth Gorking

Normalize is actually a macro instruction, which results in three instructions: a dp3, an rsq and a mul. Newer GPUs can probably run some of these instructions in parallel, but to be sure, you should do some profiling.

Are you sure? I thought at least on nVidia hardware, nrmh (for half-precision) is a one-cycle instruction, since the GeForce 6 series or so?

Of course, if GLSL doesn’t provide any way to access the half-precision datatype as Cg does, it’s a little moot…

Kenneth_Gorking 101 Aug 05, 2008 at 09:11

Pretty sure. The GeForce 6 has a special normalize unit that only works for half-precision, which allows fp16 normalizations to happen in parallel with other computations.

jstier 101 Aug 05, 2008 at 13:18

When you are rendering the leaves to billboards, couldn’t you just render the normals to a separate texture and apply that during shading?

I will give that a try. Right now I am only rendering the entire tree to a billboard for LOD, but not the leaves. They are always just a texture, while the trunk is made of polygons.

I just realized that I can do the same thing for the leaves, using geometry for close-up leaves and billboards for the ones further away. This way I will also get the normals.

Reedbeta 167 Aug 05, 2008 at 16:26

@Kenneth Gorking

Pretty sure. The GeForce 6 has a special normalize unit that only works for half-precision, which allows fp16 normalizations to happen in parallel with other computations.

Right, so…he may be right about normalize running faster than a sequence of dots and rsqrt (for half-precision vectors anyway).

Kenneth_Gorking 101 Aug 05, 2008 at 21:41

@Reedbeta

Right, so…he may be right about normalize running faster than a sequence of dots and rsqrt (for half-precision vectors anyway).

They just might. Tough shit there are no 16-bit floats in GLSL :lol:

Like I said earlier, profiling will reveal the details. I’m guessing older hardware (SM2) will run faster with the dot_unorm(), and newer hardware will be better off just using normalize() instead. Who knows, maybe the GF8-9/ATI-whatever series has native support for fp32 normalizations…