Using Bit OPs in vs3.0 to pack normals
#1
Posted 30 September 2006 - 06:41 PM
I have to store 3 normal vectors per vertex for every mesh in my current project, and I would like to reduce the memory consumption if possible.
Is it possible to use the bitwise operations of vs3.0 shaders (shift left/right, and, or) to unpack 3 byte-normal vectors from one float-normal vector? The vertex shader would then unpack this float-normal into the original 3 normal vectors.
Each float component could be laid out like this:
bits 0-7: byte-normal 1
bits 8-15: byte-normal 2
bits 16-23: byte-normal 3
After unpacking, each byte would have to be converted back to a float somehow.
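The layout above can also be unpacked without any integer instructions. The following is a minimal sketch in C, assuming the packed number is stored as a float value (not as a raw bit pattern); integers below 2^24 are exactly representable in a 32-bit float, so three bytes fit. The names `pack3`, `unpack3`, `floor_nonneg` and `byte_to_snorm` are hypothetical helpers made up for this illustration, and the byte-to-float mapping is one common convention rather than the only option.

```c
#include <stdint.h>

/* Hypothetical helper: put three bytes into the low 24 bits and keep the
   result as a float VALUE. Anything below 2^24 survives exactly. */
float pack3(uint8_t b0, uint8_t b1, uint8_t b2)
{
    return (float)((uint32_t)b0 | ((uint32_t)b1 << 8) | ((uint32_t)b2 << 16));
}

/* floor() for non-negative inputs, done by truncation so the sketch has no
   library dependency; a shader would call its floor() intrinsic instead. */
float floor_nonneg(float x)
{
    return (float)(unsigned long)x;
}

/* Unpack with divide, floor and multiply-subtract only (no shifts or masks). */
void unpack3(float packed, float out[3])
{
    float hi  = floor_nonneg(packed / 256.0f);  /* drops bits 0-7   */
    float top = floor_nonneg(hi / 256.0f);      /* drops bits 8-15  */
    out[0] = packed - hi * 256.0f;              /* bits 0-7         */
    out[1] = hi - top * 256.0f;                 /* bits 8-15        */
    out[2] = top;                               /* bits 16-23       */
}

/* Assumed convention: map a byte in [0, 255] to a component in [-1, 1]. */
float byte_to_snorm(float b)
{
    return b / 255.0f * 2.0f - 1.0f;
}
```

Every intermediate value here is a whole number below 2^24 or a division by a power of two, so the arithmetic is exact in single precision. The same sequence should translate to a shader profile that only has float math.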
Is there any chance of achieving this on current hardware?
I also thought about using a float texture (I think byte textures are not yet supported by the vertex shader).
So far, I have written everything in GL.
cheers,:whistle:
#2
Posted 30 September 2006 - 10:21 PM
-
Currently working on: the 3D engine for Tomb Raider.
#3
Posted 01 October 2006 - 01:25 AM
#4
Posted 01 October 2006 - 06:59 AM
Is there something like this in OpenGL?
Hm, but I had hoped I could also reduce the number of texture coordinates used for the deformation by packing. :unsure:
I just made an example in GL using bitwise shift operations, but it would not compile, even though I have an NV GT6600, the newest driver, and Cg 1.5 beta, which should have the ps3.0 profile. :sad:
Is there a flaw in the code, or is it possible that only DX9.0c has ps3.0?
<CG Code>
C3E2v_Output C3E2v_varying(float2 position : POSITION,
int3 texCoord : COLOR)
{
C3E2v_Output OUT;
int r = texCoord.x & 255;
int g = (texCoord.x >> 8 )& 255;
int b = (texCoord.x >> 16 )& 255;
OUT.position= float4(position,0,1.0);
OUT.color = float3(float(r)/256.0,float(g)/256.0,float(b)/256.0);
return OUT;
}
<C Code, OpenGL>
glBegin(GL_TRIANGLES);
glTexCoord2i(0xff0000, 0);
glVertex2f(-0.8, 0.8);
glTexCoord2i(0x00ff00, 0);
glVertex2f(0.8, 0.8);
glTexCoord2i(0x0000ff, 0);
glVertex2f(0.0, -0.8);
glEnd();
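As a CPU-side cross-check of what the shader is meant to compute for the three test vertices, the same shifts and masks can be run in plain C. This is only a sketch: `Rgb8` and `unpack_rgb` are hypothetical names, and it assumes the packed value arrives in the shader unchanged.

```c
#include <stdint.h>

/* Hypothetical helper type holding the three unpacked bytes. */
typedef struct { uint8_t r, g, b; } Rgb8;

/* Same bit layout as the Cg snippet: r in bits 0-7, g in 8-15, b in 16-23. */
Rgb8 unpack_rgb(uint32_t packed)
{
    Rgb8 c;
    c.r = (uint8_t)(packed & 255u);
    c.g = (uint8_t)((packed >> 8) & 255u);
    c.b = (uint8_t)((packed >> 16) & 255u);
    return c;
}
```

With this layout, 0xff0000 ends up in the b channel and 0x0000ff in r, so the first vertex should come out blue rather than red. Dividing by 255.0 instead of 256.0 would map a byte of 255 to exactly 1.0.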
#5
Posted 01 October 2006 - 03:05 PM
As for your code, I spotted 2 errors:
1: 'texCoord' is declared as an int3; it should just be an int
2: 'texCoord' is mapped to the COLOR semantic, but your GL code passes the value through glTexCoord, which is TEXCOORD0.
Another optimization you might try, as .oisyn suggested, is to derive the third component of the normal in the shader and pack the xy part along with the position.
// CG Code
C3E2v_Output C3E2v_varying(float4 position : POSITION)
{
C3E2v_Output OUT;
// Extract position
OUT.position = float4(position.xy, 0.0, 1.0);
// Derive normal
float3 normal;
normal.xy = position.zw;
normal.z = sqrt( 1 - normal.x*normal.x - normal.y*normal.y );
OUT.color = normal;
return OUT;
}
// C Code, OpenGL
glBegin(GL_TRIANGLES);
glVertex4f(-0.8, 0.8, 0.0, 0.0);
glVertex4f( 0.8, 0.8, 1.0, 0.0);
glVertex4f( 0.0, -0.8, 0.0, 1.0);
glEnd();
P.S. There is a new version of Cg out that you might want to try.
#6
Posted 22 October 2006 - 03:30 AM
Maybe I should do something like render to vertex-array, then I could use byte-textures..
#7
Posted 22 October 2006 - 02:07 PM
Render to vertex-array is only supported by Direct3D, unless your hardware supports über-buffers.
#8
Posted 22 October 2006 - 04:56 PM
Kenneth Gorking said:
I'm not sure what you mean by "über-buffers", but OpenGL has an extension that provides render to vertex-array, and it's supported on (I believe) GF 6600 and up (not sure what the equivalent is in the ATI world now...)
#9
Posted 23 October 2006 - 05:02 PM
Reedbeta said:
Über-buffers, or SuperBuffers as they are also called, are basically just a generalization of the whole GL object thing where you use a memory object instead, that can be used as the memory of textures, framebuffers and vertex arrays. This means that you can render to the memory of a texture, and just attach that memory object to a vertex buffer, giving you render to vertex buffer. Read more...
I had read about pixel buffers, but they are not supported on my hardware so I never looked into them much. The functionality of render to vertex buffer can easily be simulated though, by rendering to a texture and then copying it to vertex buffer. Might not be as fast, but it should be easy to do.
#10
Posted 23 October 2006 - 09:04 PM
spacerat said:
You know, I tend to forget these kinds of technicalities. The funny thing is that, although I actually knew about this issue, I only just realized why the quaternion-compression code for network packets I wrote ages ago didn't work like it was supposed to.
#11
Posted 27 October 2006 - 12:11 AM
I'm now experimenting with FBOs and render to vertex buffer for skinned animation.
Rendering a float texture to the FBO is quite fast, but when it comes to the glReadPixels( .. ) call that copies the FBO buffer to my VBO, the framerate drops to almost half (RGBA) or collapses completely (RGB)!
If the VBO/Framebuffer format is RGBA (GL_FLOAT_RGBA32_NV) it drops from
138 fps (render VBO) to
108 fps (render float texture+render VBO) to
70 fps (render texture+copy to VBO+render VBO)
and in the case of RGB (GL_FLOAT_RGB32_NV) it's even worse:
:lol: 161 fps (render VBO) to
:happy: 116 fps (render float texture+render VBO) to
:angry: 8 fps !!! (render float texture+copy to VBO+render VBO)
(Did I run into some software emulation?)
Is there any reason why glReadPixels is so slow?
Shouldn't it be as fast as (or faster than) rendering the float texture?
At the moment the only alternative I can think of is using the FBO texture in the vertex shader. Unfortunately I don't have an ATI card, so I can't use render to vertex buffer in DX9.
Here is the copy-to-VBO source:
glReadBuffer(GL_COLOR_ATTACHMENT0_EXT);
glBindBufferARB(GL_PIXEL_PACK_BUFFER_EXT, vbo_vertices_handle);
glReadPixels(0, 0, tex_width, tex_height, GL_RGB, GL_FLOAT, 0);
#12
Posted 27 October 2006 - 12:42 AM
#13
Posted 27 October 2006 - 12:56 AM
I found that glReadPixels causes a glFinish, waiting for all operations to finish. Maybe this is one reason; however, it still can't explain the 8 fps for RGB.
#14
Posted 27 October 2006 - 01:45 AM
#15
Posted 27 October 2006 - 02:16 AM
#16
Posted 27 October 2006 - 02:29 AM
And the copy should be done by the GPU.
At least, that is what is stated here: http://www.gpgpu.org/developer/
I wrote my code by looking at the render-to-vertex-array class of the GPU Particle demo.
I also don't have an answer for the 8 fps.
But I found half a solution. If I set up my FBO/VBO as RGB and use glReadPixels with RGBA, then it's quite fast; however, the data seems to get corrupted by doing this.
Is it not just a 1:1 copy?
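One plausible reason the data looks wrong, assuming the VBO is sized for and later read as tightly packed three-float vertices: a readback with GL_RGBA writes four floats per pixel, so the layout no longer matches. The C sketch below only does the size and offset arithmetic; `readback_bytes` and `pixel_offset` are hypothetical helpers, not GL calls.

```c
#include <stddef.h>

/* Bytes glReadPixels writes for GL_FLOAT data with `components` channels per
   pixel (3 for GL_RGB, 4 for GL_RGBA). Rows of 4-byte floats already satisfy
   the default GL_PACK_ALIGNMENT of 4, so no row padding is added. */
size_t readback_bytes(size_t width, size_t height, size_t components)
{
    return width * height * components * sizeof(float);
}

/* Index (in floats) at which pixel `i` starts in the readback buffer. */
size_t pixel_offset(size_t i, size_t components)
{
    return i * components;
}
```

For a 256x256 target the RGBA readback is a third larger than the RGB one, and pixel 1 starts at float 4 instead of float 3. A vertex array that steps through the buffer three floats at a time would therefore mix components of neighbouring pixels, and the write could also run past the end of a buffer allocated for RGB.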
Here is another link:
http://oss.sgi.com/p...ffer_object.txt
#17
Posted 27 October 2006 - 02:59 AM
#18
Posted 27 October 2006 - 03:25 AM
#19
Posted 27 October 2006 - 02:26 PM
I'm not sure if they are provided by GLSL or HLSL, but they definitely seem useful.
#20
Posted 27 October 2006 - 04:49 PM
Reedbeta said:
Btw, while glReadPixels is still a bad idea (due to the CPU/GPU interlock) the copy-performance is not as bad anymore. With PCI-Express cards readback is rather fast.