

Shader Optimization


9 replies to this topic

#1 jstier

    New Member

  • Members
  • 12 posts

Posted 03 August 2008 - 03:49 PM

Hi,


I am curious whether anyone has input on this shader for rendering tree leaves. Patches of leaves are made up of textured billboards that always face the viewer. The only custom input to the shaders is a vertex attribute containing the coordinates of each billboard corner in the x-y plane. Can anyone see possible performance improvements, or does anyone have other comments?

The shaders are integrated with Geist3D. Anyone interested can download the Geist3D IDE and experiment with the shaders via the GLSL source code editor.

The vertex shader positions the vertices of a billboard perpendicular to the eye vector:

attribute vec2 gs_Attribute;	// billboard corner offset in the x-y plane

varying vec3 vLight;
varying vec3 vNormal;

void main()
{
	// Columns 1 and 0 of the inverse modelview matrix: the camera's up and right axes in object space.
	vec3 U = gl_ModelViewMatrixInverse[1].xyz;
	vec3 V = gl_ModelViewMatrixInverse[0].xyz;

	// Column 3 is the eye position in object space; it is used here as the billboard normal.
	vNormal = gl_ModelViewMatrixInverse[3].xyz;
	vLight = vec3(gl_ModelViewMatrixInverse * gl_LightSource[1].position);

	// Offset gl_Vertex along the camera axes so the quad faces the viewer.
	vec3 pos = gs_Attribute.x * V + gs_Attribute.y * U + gl_Vertex.xyz;
	gl_Position = gl_ModelViewProjectionMatrix * vec4(pos, 1.0);
	gl_TexCoord[0] = gl_MultiTexCoord0;
}

The fragment shader simply maps a texture containing a few dozen leaves onto the quad and then applies diffuse lighting (the N·L term of Phong shading). I think this could be improved a lot by perturbing the normals, since not all leaves in a real patch face the viewer at the same angle. I am thinking of using a noise texture to perturb the normals by up to ±90 degrees:

uniform sampler2D Texture;

varying vec3 vLight;
varying vec3 vEye;	// declared but never used; the vertex shader does not write it
varying vec3 vNormal;

void main (void)
{
	vec4 color = texture2D(Texture, gl_TexCoord[0].st);

	// Interpolated varyings are not unit length, so renormalize per fragment.
	vec3 light = normalize(vLight);
	vec3 normal = normalize(vNormal);
	float intensity = max(dot(light, normal), 0.0);

	color.rgb *= vec3(intensity);
	gl_FragColor = color;
}
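
A minimal sketch of the noise-texture idea, in the same legacy GLSL as the shader above. The sampler NoiseTexture and the uniform PerturbStrength are hypothetical names that do not exist in the original shader, and the noise is assumed to be stored as RGB values in [0,1]:

```glsl
// Sketch only: NoiseTexture and PerturbStrength are hypothetical, not part of the original shader.
uniform sampler2D Texture;
uniform sampler2D NoiseTexture;	// assumed: RGB noise stored in [0,1]
uniform float PerturbStrength;	// assumed: 0.0 leaves the normal unchanged

varying vec3 vLight;
varying vec3 vNormal;

void main (void)
{
	vec4 color = texture2D(Texture, gl_TexCoord[0].st);

	// Remap the noise from [0,1] to [-1,1] and add it to the unit normal as an offset.
	vec3 offset = texture2D(NoiseTexture, gl_TexCoord[0].st).rgb * 2.0 - 1.0;
	vec3 normal = normalize(normalize(vNormal) + PerturbStrength * offset);

	float intensity = max(dot(normalize(vLight), normal), 0.0);

	color.rgb *= vec3(intensity);
	gl_FragColor = color;
}
```

Larger values of PerturbStrength tilt the normal further from the billboard normal. If the scaled offset exactly cancels the normal, the final normalize receives a zero vector, so in practice the strength would need to be kept moderate.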


Here is a pic of what a tree looks like:

[Image: screenshot of a rendered tree]
Check out Geist3D at www.geist3d.com

#2 Kenneth Gorking

    Senior Member

  • Members
  • 939 posts

Posted 04 August 2008 - 09:07 AM

The code is already fairly simple, but you can save an inverse square root in the fragment shader:

// Same as dot(normalize(A), normalize(B)), but with one fewer inverse square root.
// inversesqrt() is GLSL's reciprocal square root (rsqrt in HLSL/Cg).
float dot_unorm(vec3 A, vec3 B)
{
	return dot(A, B) * inversesqrt(dot(A, A) * dot(B, B));
}

uniform sampler2D Texture;

varying vec3 vLight;
varying vec3 vEye;
varying vec3 vNormal;

void main (void)
{
	vec4 color = texture2D(Texture, gl_TexCoord[0].st);

	float intensity = max(dot_unorm(vLight, vNormal), 0.0);

	color.rgb *= vec3(intensity);
	gl_FragColor = color;
}
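
For reference, the identity behind dot_unorm can be written out as follows. This is only a sketch of the algebra, assuming both vectors are non-zero:

```latex
\[
\hat{A}\cdot\hat{B}
  = \frac{A}{\lVert A\rVert}\cdot\frac{B}{\lVert B\rVert}
  = \frac{A\cdot B}{\sqrt{A\cdot A}\,\sqrt{B\cdot B}}
  = (A\cdot B)\,\frac{1}{\sqrt{(A\cdot A)\,(B\cdot B)}}
\]
```

Normalizing each vector separately takes one inverse square root per vector, while the fused form takes a single one over the product of the squared lengths.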

Instead of using a noise texture to perturb the normals, it would be faster to create a normal map for the leaf texture in an offline process. A normal map would only cost one extra texture sample and an addition in the shader.
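
A minimal sketch of how such a normal map might be applied, under stated assumptions: the sampler NormalMap is a hypothetical name, the map is assumed to store normals relative to the billboard plane as RGB in [0,1], and the billboard axes are taken from the inverse modelview matrix. It replaces the original vNormal varying rather than adding to it, so it is a variation on the idea, not the exact method described:

```glsl
// Sketch only: NormalMap is hypothetical, not part of the original shader.
uniform sampler2D Texture;
uniform sampler2D NormalMap;	// assumed: billboard-space normals stored as RGB in [0,1]

varying vec3 vLight;	// object-space light vector from the vertex shader

void main (void)
{
	vec4 color = texture2D(Texture, gl_TexCoord[0].st);

	// Decode the stored normal from [0,1] back to [-1,1].
	vec3 n = texture2D(NormalMap, gl_TexCoord[0].st).rgb * 2.0 - 1.0;

	// Billboard basis in object space: camera right, camera up, and the axis pointing at the viewer.
	vec3 right = normalize(gl_ModelViewMatrixInverse[0].xyz);
	vec3 up = normalize(gl_ModelViewMatrixInverse[1].xyz);
	vec3 toViewer = normalize(gl_ModelViewMatrixInverse[2].xyz);

	// Rotate the decoded normal into object space so it matches vLight.
	vec3 normal = normalize(n.x * right + n.y * up + n.z * toViewer);

	float intensity = max(dot(normalize(vLight), normal), 0.0);

	color.rgb *= vec3(intensity);
	gl_FragColor = color;
}
```

The three basis vectors are constant for a draw call, so they could also be computed once and passed in as uniforms instead of being normalized per fragment.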
"Stupid bug! You go squish now!!" - Homer Simpson

#3 rouncer

    Senior Member

  • Members
  • 2762 posts

Posted 04 August 2008 - 11:09 AM

that tree looks really nice.

#4 jstier

    New Member

  • Members
  • 12 posts

Posted 04 August 2008 - 04:46 PM

I will keep that dot_unorm in mind, because I could use it in a lot of other shaders. I wonder whether it is truly faster, though. Doesn’t the normalize operation run in hardware, which would make it faster than several calls to dot and rsqrt? I am just guessing here.

I will look into the normal map. It will bring some improvement, but the problem is also that all the leaves in the patch texture are parallel to the billboard plane. In a real tree, the leaves are turned in all directions and hence reflect light differently. A normal map alone won’t do the trick. I would actually have to create a different patch texture where some of the leaves are turned, but it would be almost impossible to create a normal map for that …

The trees are generated using the algorithm described in “Creation and Rendering of Realistic Trees” by Jason Weber. Just by changing a few parameters you can create vastly different trees. In large scenes the trees are rendered into billboards which are then used for different LODs.

#5 Kenneth Gorking

    Senior Member

  • Members
  • 939 posts

Posted 04 August 2008 - 09:41 PM

jstier said:

I will keep that dot_unorm in mind, because I could use it in a lot of other shaders. I wonder whether it is truly faster, though. Doesn’t the normalize operation run in hardware, which would make it faster than several calls to dot and rsqrt? I am just guessing here.

Not likely, because normalize is actually a macro instruction that expands to three instructions: a dp3, an rsq and a mul. Newer GPUs can probably run some of these instructions in parallel, but to be sure, you should do some profiling.
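
As a rough illustration of that expansion, normalize can be written by hand as below. The function name normalize_expanded is hypothetical, and how a given driver actually compiles normalize is an assumption that only profiling or inspecting the compiled shader can confirm:

```glsl
// Sketch only: a hand-written equivalent of normalize() for a non-zero vector.
vec3 normalize_expanded(vec3 v)
{
	float lenSq = dot(v, v);		// dp3: squared length
	float invLen = inversesqrt(lenSq);	// rsq: reciprocal of the length
	return v * invLen;			// mul: scale to unit length
}
```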

jstier said:

I will look into the normal map. It will bring some improvement, but the problem is also that all the leaves in the patch texture are parallel to the billboard plane. In a real tree, the leaves are turned in all directions and hence reflect light differently. A normal map alone won’t do the trick. I would actually have to create a different patch texture where some of the leaves are turned, but it would be almost impossible to create a normal map for that …

When you are rendering the leaves to billboards, couldn't you just render the normals to a separate texture, and apply that during shading?

#6 Reedbeta

    DevMaster Staff

  • Administrators
  • 5344 posts
  • LocationSanta Clara, CA

Posted 04 August 2008 - 10:26 PM

Kenneth Gorking said:

Normalize is actually a macro instruction that expands to three instructions: a dp3, an rsq and a mul. Newer GPUs can probably run some of these instructions in parallel, but to be sure, you should do some profiling.

Are you sure? I thought at least on nVidia hardware, nrmh (for half-precision) is a one-cycle instruction, since the GeForce 6 series or so?

Of course, if GLSL doesn't provide any way to access the half-precision datatype as Cg does, it's a little moot...
reedbeta.com - developer blog, OpenGL demos, and other projects

#7 Kenneth Gorking

    Senior Member

  • Members
  • 939 posts

Posted 05 August 2008 - 09:11 AM

Pretty sure. The GeForce 6 has a special normalize unit that only works for half-precision, which allows fp16 normalizations to happen in parallel with other computations.

#8 jstier

    New Member

  • Members
  • 12 posts

Posted 05 August 2008 - 01:18 PM

Quote

When you are rendering the leaves to billboards, couldn't you just render the normals to a separate texture, and apply that during shading?

I will give that a try. Right now I am only rendering the entire tree to a billboard for LOD, but not the leaves. They are always just a texture, while the trunk is made of polygons.

I just realized that I can do the same thing for the leaves, by using geometry for close-up leaves and billboards for those further away. That way I will also get the normals.

#9 Reedbeta

    DevMaster Staff

  • Administrators
  • 5344 posts
  • LocationSanta Clara, CA

Posted 05 August 2008 - 04:26 PM

Kenneth Gorking said:

Pretty sure. The GeForce 6 has a special normalize unit that only works for half-precision, which allows fp16 normalizations to happen in parallel with other computations.

Right, so...he may be right about normalize running faster than a sequence of dots and rsqrt (for half-precision vectors anyway).

#10 Kenneth Gorking

    Senior Member

  • Members
  • 939 posts

Posted 05 August 2008 - 09:41 PM

Reedbeta said:

Right, so...he may be right about normalize running faster than a sequence of dots and rsqrt (for half-precision vectors anyway).
They just might. Tough shit there are no 16-bit floats in GLSL :lol:

Like I said earlier, profiling will reveal the details. I'm guessing older hardware (SM2) will run faster with the dot_unorm(), and newer hardware will be better off just using normalize() instead. Who knows, maybe the GF8-9/ATI-whatever series has native support for fp32 normalizations...




