

Vertex Size


7 replies to this topic

#1 idreamlovey

    Member

  • Members
  • 91 posts

Posted 18 August 2007 - 05:19 AM

I have three vertex formats, as given below:

struct D3DVERTEX
{
    D3DXVECTOR3 p;          // 12 bytes
    D3DXVECTOR3 n;          // 12 bytes
    FLOAT       tu, tv;     //  8 bytes
}; // 32-byte stride

struct D3DVERTEX2T
{
    D3DXVECTOR3 p;          // 12 bytes
    D3DXVECTOR3 n;          // 12 bytes
    FLOAT       tu, tv;     //  8 bytes
    FLOAT       tu2, tv2;   //  8 bytes
    FLOAT       tu3, tv3;   //  8 bytes, unused: padding up to a multiple of 64 bytes
    FLOAT       tu4, tv4;   //  8 bytes, unused: padding up to a multiple of 64 bytes
    FLOAT       tu5, tv5;   //  8 bytes, unused: padding up to a multiple of 64 bytes
}; // 64-byte stride

struct D3DVERTEX3C
{
    D3DXVECTOR3 p;          // 12 bytes
    D3DCOLOR    Color;      //  4 bytes
    FLOAT       tu, tv;     //  8 bytes
    FLOAT       tu2, tv2;   //  8 bytes, unused: padding up to a multiple of 32 bytes
}; // 32-byte stride
My question is: shall I use three VBs, one for each of the vertex types above, or one VB with the following vertex type?
 

struct D3DVERTEX
{
    D3DXVECTOR3 p;          // 12 bytes
    D3DXVECTOR3 n;          // 12 bytes
    D3DCOLOR    Color;      //  4 bytes
    FLOAT       tu, tv;     //  8 bytes
    FLOAT       tu2, tv2;   //  8 bytes
    FLOAT       tu3, tv3;   //  8 bytes, unused: padding up to a multiple of 64 bytes
    FLOAT       tu4, tv4;   //  8 bytes, unused: padding up to a multiple of 64 bytes
    D3DCOLOR    Color2;     //  4 bytes, unused: padding up to a multiple of 64 bytes
}; // 64-byte stride


In the second case you can clearly see the unnecessary memory consumption. In the first case the wasted memory is limited, because only a small part of the data is padded: say I have 1 MB of D3DVERTEX, 200-300 kB of D3DVERTEX2T and 100-200 kB of D3DVERTEX3C. But I have 3 VB switches. In the second case there are no VB switches and the code looks simpler, but there is memory overhead and more data to transfer.
1. Which one do you think is better? Or is there another method that would improve things?
2. I have also heard of multiple streams. What should I do to split the data into two streams without any conflicts?
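
To put rough numbers on the trade-off described above, here is a minimal sketch that compares the total vertex memory of the two layouts. It assumes the low-end sizes from the post (1 MB, 200 kB, 100 kB). `Vec3`, `Color`, the shortened struct names and the helper functions are stand-ins invented for this example so it compiles without the DirectX SDK; they are not part of D3D.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins (assumptions) so the sketch compiles without the DirectX SDK.
struct Vec3 { float x, y, z; };   // same size as D3DXVECTOR3: 12 bytes
using Color = std::uint32_t;      // same size as D3DCOLOR: 4 bytes

// The three separate formats from the post, padding included.
struct Vertex   { Vec3 p; Vec3 n; float tu, tv; };
struct Vertex2T { Vec3 p; Vec3 n; float uv[5][2]; };       // last 3 UV pairs are padding
struct Vertex3C { Vec3 p; Color color; float uv[2][2]; };  // last UV pair is padding

// The single combined format from the post.
struct VertexAll { Vec3 p; Vec3 n; Color color; float uv[4][2]; Color color2; };

static_assert(sizeof(Vertex)    == 32, "expected a 32-byte stride");
static_assert(sizeof(Vertex2T)  == 64, "expected a 64-byte stride");
static_assert(sizeof(Vertex3C)  == 32, "expected a 32-byte stride");
static_assert(sizeof(VertexAll) == 64, "expected a 64-byte stride");

// Buffer sizes in bytes for each of the three separate formats.
struct Scene { std::size_t bytesVertex, bytes2T, bytes3C; };

// Case 1: three vertex buffers, each in its own format.
constexpr std::size_t separateBytes(Scene s) {
    return s.bytesVertex + s.bytes2T + s.bytes3C;
}

// Case 2: the same vertex counts, all stored in the 64-byte combined format.
constexpr std::size_t unifiedBytes(Scene s) {
    return (s.bytesVertex / sizeof(Vertex) +
            s.bytes2T     / sizeof(Vertex2T) +
            s.bytes3C     / sizeof(Vertex3C)) * sizeof(VertexAll);
}
```

With `Scene{1024 * 1024, 200 * 1024, 100 * 1024}`, the three separate buffers come to roughly 1.3 MB, while the combined format needs roughly 2.4 MB, mostly because every 32-byte D3DVERTEX doubles in size.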

#2 Reedbeta

    DevMaster Staff

  • Administrators
  • 4782 posts
  • Location: Bellevue, WA

Posted 18 August 2007 - 05:46 AM

I would say using three vertex streams would be better than wasting memory. A bit of increased code complexity is a small price to pay for making the CPU-to-GPU memory transfer as small as possible.

I don't know personally how to set up multiple streams in D3D, though I'm sure someone else will reply about that. But you might be able to avoid the switch() statements and write the code more cleanly using template tricks.
reedbeta.com - developer blog, OpenGL demos, and other projects
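
One possible reading of the "template tricks" mentioned above is a traits template that maps each vertex struct to its FVF code and stride, so the draw code needs no switch() on the vertex type. This is only a sketch of that idea, not necessarily what was meant. `VertexTraits`, `DrawSetup` and `drawSetupFor` are hypothetical names, and the `FVF_*` constants are local stand-ins that mirror the corresponding `D3DFVF_*` flags so the example compiles without the SDK headers.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins (assumptions) so the sketch compiles without the DirectX SDK.
struct Vec3 { float x, y, z; };   // same size as D3DXVECTOR3
using Color = std::uint32_t;      // same size as D3DCOLOR

// Local copies mirroring D3DFVF_XYZ, D3DFVF_NORMAL, D3DFVF_DIFFUSE,
// D3DFVF_TEX1 and D3DFVF_TEX2; real code would include d3d9.h instead.
constexpr std::uint32_t FVF_XYZ     = 0x002;
constexpr std::uint32_t FVF_NORMAL  = 0x010;
constexpr std::uint32_t FVF_DIFFUSE = 0x040;
constexpr std::uint32_t FVF_TEX1    = 0x100;
constexpr std::uint32_t FVF_TEX2    = 0x200;

struct Vertex   { Vec3 p; Vec3 n; float tu, tv; };
struct Vertex3C { Vec3 p; Color color; float tu, tv; float tu2, tv2; };

// Primary template is left undefined: using an unregistered vertex type
// becomes a compile-time error instead of a missed switch() case.
template <typename V> struct VertexTraits;

template <> struct VertexTraits<Vertex> {
    static constexpr std::uint32_t fvf() { return FVF_XYZ | FVF_NORMAL | FVF_TEX1; }
};
template <> struct VertexTraits<Vertex3C> {
    static constexpr std::uint32_t fvf() { return FVF_XYZ | FVF_DIFFUSE | FVF_TEX2; }
};

// What the renderer would pass to SetFVF and SetStreamSource for type V.
struct DrawSetup { std::uint32_t fvf; std::size_t stride; };

template <typename V>
constexpr DrawSetup drawSetupFor() {
    return DrawSetup{VertexTraits<V>::fvf(), sizeof(V)};
}
```

A templated draw function could then call `drawSetupFor<V>()` once and hand `fvf` and `stride` to the device, with the vertex type chosen at compile time.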

#3 idreamlovey

    Member

  • Members
  • 91 posts

Posted 18 August 2007 - 10:09 AM

Thanks Reedbeta, I am thinking of that too. I read about multistreaming in the SDK documentation, and again I found myself in a sinking boat, because I came across the costs of the different state changes, which are as follows:

Most to least expensive:

API call                               Average number of cycles
SetVertexDeclaration                   6500 - 11250
SetFVF                                 6400 - 11200
SetVertexShader                        3000 - 12100
SetPixelShader                         6300 - 7000
SPECULARENABLE                         1900 - 11200
SetRenderTarget                        6000 - 6250
SetPixelShaderConstant (1 constant)    1500 - 9000
NORMALIZENORMALS                       2200 - 8100
LightEnable                            1300 - 9000
SetStreamSource                        3700 - 5800
LIGHTING                               1700 - 7500
DIFFUSEMATERIALSOURCE                   900 - 8300
AMBIENTMATERIALSOURCE                   900 - 8200
COLORVERTEX                             800 - 7800
SetLight                               2200 - 5100
SetTransform                           3200 - 3750
SetIndices                              900 - 5600

and so on.
Now in my case it is clear that by using multiple streams I will reduce the traffic between the CPU and the GPU. But I have to make additional calls: SetVertexDeclaration 3 times and SetStreamSource 6 times. What do you say to that?
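
For scale, the extra calls mentioned above can be added up from the table. This is a small worked example that assumes the 3 + 6 calls happen once per frame and that the table's cycle ranges simply add. `CycleRange`, `scaled` and `sum` are names made up for this sketch.

```cpp
// A low/high estimate in CPU cycles, as given in the table.
struct CycleRange { unsigned long lo, hi; };

// Cost of repeating one call `calls` times.
constexpr CycleRange scaled(CycleRange r, unsigned long calls) {
    return CycleRange{r.lo * calls, r.hi * calls};
}

// Combined cost of two groups of calls.
constexpr CycleRange sum(CycleRange a, CycleRange b) {
    return CycleRange{a.lo + b.lo, a.hi + b.hi};
}

// Figures copied from the table.
constexpr CycleRange kSetVertexDeclaration{6500, 11250};
constexpr CycleRange kSetStreamSource{3700, 5800};

// 3 x SetVertexDeclaration + 6 x SetStreamSource.
constexpr CycleRange kExtraCost =
    sum(scaled(kSetVertexDeclaration, 3), scaled(kSetStreamSource, 6));
```

That comes to between 41700 and 68550 cycles. On a hypothetical 2 GHz CPU this is about 21 to 34 microseconds, assuming the table's averages apply to this hardware and driver.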

#4 kusma

    Valued Member

  • Members
  • 163 posts

Posted 18 August 2007 - 01:53 PM

Don't confuse function call overhead with state change overhead. Many GPUs have a fixed state-change overhead when you change whatever state requires regenerating an internal pixel shader, for instance. D3D buffers state changes and hands them to the driver in a batch. How this affects performance is of course very dependent on the GPU/driver architecture.

#5 idreamlovey

    Member

  • Members
  • 91 posts

Posted 18 August 2007 - 02:04 PM

So kusma, is that your thumbs up for multistreaming to save memory bandwidth? Or shall I stick with the 3 VBs, as I described in my first case and Reedbeta suggested?

#6 kusma

    Valued Member

  • Members
  • 163 posts

Posted 18 August 2007 - 02:32 PM

idreamlovey: Not necessarily, different GPUs have different caching characteristics, so it's difficult to tell. My first instincts tell me to try to avoid having padding bytes, but I'm not sure how well different vertex processors would handle this. I guess this is something you should profile before you make a decision.

#7 idreamlovey

    Member

  • Members
  • 91 posts

Posted 18 August 2007 - 05:39 PM

I went through the ATI optimization papers, and they suggest doing whatever it takes to keep the number of vertex streams as low as possible. Extra streams may lower performance for reasons that are hard to predict, so I will go with my first choice and see what happens. Anyway, thank you all for your kind replies.

#8 idreamlovey

    Member

  • Members
  • 91 posts

Posted 20 August 2007 - 04:11 AM

Here is another solution I found: I will use one VB but not specify its FVF at creation time. I will dump the 32-byte static data into it as I did before, and at render time I set the FVF according to the material. Doing this, I keep my code simple and gained an extra 10 fps while rendering 56k triangles with all effects.
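
The bookkeeping behind this approach can be sketched roughly as follows, assuming every format packed into the shared buffer has the same 32-byte stride. `SharedVertexData` and `Batch` are hypothetical names, and a `std::vector` stands in for the locked vertex buffer, so this shows only the layout logic, not the poster's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Every format stored in the shared buffer must use this stride.
constexpr std::size_t kStride = 32;

// What the renderer needs per material: which FVF to set before drawing,
// and where the vertices sit (in vertices, not bytes).
struct Batch {
    std::uint32_t fvf;
    std::size_t   startVertex;
    std::size_t   vertexCount;
};

// CPU-side staging area standing in for a single vertex buffer that was
// created without an FVF.
class SharedVertexData {
public:
    // Appends `count` vertices and records the FVF to use when drawing them.
    Batch append(const void* vertices, std::size_t count,
                 std::size_t stride, std::uint32_t fvf) {
        assert(stride == kStride && "all formats must share one stride");
        const std::size_t start = bytes_.size() / kStride;
        const unsigned char* src = static_cast<const unsigned char*>(vertices);
        bytes_.insert(bytes_.end(), src, src + count * kStride);
        return Batch{fvf, start, count};
    }

    std::size_t sizeInBytes() const { return bytes_.size(); }

private:
    std::vector<unsigned char> bytes_;
};
```

At draw time each `Batch` would translate to a `SetFVF(batch.fvf)` call followed by a draw call that starts at `batch.startVertex`, with the stream source and stride set only once.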




