VBO multistream vs. single stream

Snoob 105 Jul 30, 2013 at 05:44 opengl optimization opengl-es

I'm currently trying to optimize my mesh object handling and model importing. Over the last few days I've read several articles about geometry handling and VBO value streaming. At the moment I see two possible solutions:

  1. Create a single VBO holding all data for a geometry shape. In this case I have to maintain multiple vertex structures to cover every possible representation of a shape (e.g. solid, textured, unlit, lit, PPL, etc.).
  2. Create multiple VBOs to perform multistreaming. Based on an older DX GPU Gems article, I could split the data into separate streams, each holding one group of values of the geometry shape representation, and activate them only when required (e.g. GeometryStream, TextureStream, LitStream …).
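To make the two options concrete, here is a minimal sketch of how the same vertex data would be laid out in memory in each case. It only packs bytes on the CPU and makes no OpenGL calls; the attribute set (position, normal, one UV set) and the stream names are assumptions chosen for illustration.

```python
import struct

# Hypothetical mesh data: two vertices with position (3 floats),
# normal (3 floats) and one UV set (2 floats).
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
uvs = [(0.0, 0.0), (1.0, 0.0)]

def pack_interleaved(positions, normals, uvs):
    """Option 1: one buffer, each vertex stores all of its attributes."""
    data = bytearray()
    for p, n, t in zip(positions, normals, uvs):
        data += struct.pack("<3f3f2f", *p, *n, *t)
    return bytes(data)

def pack_streams(positions, normals, uvs):
    """Option 2: one tightly packed buffer per attribute group."""
    return {
        "geometry": b"".join(struct.pack("<3f", *p) for p in positions),
        "lit": b"".join(struct.pack("<3f", *n) for n in normals),
        "texture": b"".join(struct.pack("<2f", *t) for t in uvs),
    }

interleaved = pack_interleaved(positions, normals, uvs)
streams = pack_streams(positions, normals, uvs)

# Byte distance between consecutive vertices in the interleaved buffer.
INTERLEAVED_STRIDE = struct.calcsize("<3f3f2f")
```

Both layouts hold the same number of bytes in total. The difference is that the interleaved buffer needs a stride and per-attribute offsets, while each separate stream can be bound or skipped on its own.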

Article: http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter05.html

Based on your development experience, I'd like to ask: which approach would you prefer, and why? What pros and cons do you see in these solutions? Or do you have a better solution?

8 Replies


TheNut 179 Jul 30, 2013 at 14:00

I multistream simply due to its greater flexibility. It’s easier to represent a mesh as its individual components in your framework and have your shader host code pick and choose what it needs to render on the GPU. It also has the benefit of being the most efficient with dynamic vertex buffers, such as updating particles or performing CPU-based skeletal animation, since you only have to update the vertex position buffer (and the normal buffer in the case of skeletal animation) while leaving the other buffers alone. From an efficiency standpoint, however, working with single streams does have the benefit of making fewer function calls at render time, which may or may not be influential. Rendering thousands of tiny buffers may hinder performance compared with rendering them from a single buffer (if possible).
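A rough back-of-the-envelope sketch of the dynamic-update argument follows. The attribute sizes and the vertex count are assumptions, not measurements, and the interleaved case assumes the whole buffer is re-uploaded each frame rather than patched in many small ranges.

```python
# Assumed per-vertex attribute sizes in bytes (illustrative only).
ATTRIBUTE_BYTES = {"position": 12, "normal": 12, "uv": 8, "color": 4}

def upload_bytes_separate(vertex_count, dynamic_attributes):
    """Separate streams: only the buffers of the changed attributes are re-uploaded."""
    return vertex_count * sum(ATTRIBUTE_BYTES[a] for a in dynamic_attributes)

def upload_bytes_interleaved(vertex_count):
    """Single interleaved buffer re-uploaded in full: static attributes travel along."""
    return vertex_count * sum(ATTRIBUTE_BYTES.values())

vertex_count = 10_000
skinned = upload_bytes_separate(vertex_count, ["position", "normal"])
full = upload_bytes_interleaved(vertex_count)
```

Under these assumptions, CPU skinning re-uploads 24 of the 36 bytes per vertex with separate streams. How much that matters in practice depends on the driver and on bus bandwidth.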

rouncer 103 Jul 30, 2013 at 16:03

I use multistream primarily for high-speed instancing; that’s the main use.
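As a sketch of why instancing goes naturally with multiple streams: one stream advances per vertex, a second one advances per instance. The code below imitates that fetch rule on the CPU, following the idea of an attribute divisor as in `glVertexAttribDivisor`. The function names and the data are hypothetical, and a real renderer would do this on the GPU.

```python
def fetch_attributes(vertex_stream, instance_stream, vertex_index, instance_index, divisor=1):
    """Per-vertex data advances with every vertex; per-instance data advances
    once every `divisor` instances (divisor must be >= 1 in this sketch)."""
    per_vertex = vertex_stream[vertex_index]
    per_instance = instance_stream[instance_index // divisor]
    return per_vertex, per_instance

# Hypothetical data: one quad shared by all instances, one offset per instance.
quad_positions = [(0, 0), (1, 0), (1, 1), (0, 1)]
instance_offsets = [(0, 0), (10, 0), (20, 0)]

def draw_instanced(vertex_stream, instance_stream):
    """Emit the final position of every vertex of every instance."""
    out = []
    for i in range(len(instance_stream)):
        for v in range(len(vertex_stream)):
            (x, y), (ox, oy) = fetch_attributes(vertex_stream, instance_stream, v, i)
            out.append((x + ox, y + oy))
    return out

result = draw_instanced(quad_positions, instance_offsets)
```

The shared quad is stored once, and only the small per-instance stream grows with the number of instances.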

Snoob 105 Jul 31, 2013 at 05:02

Thanks for your advice, I will give the multistream buffers a try.

Sol_HSA 119 Jul 31, 2013 at 05:30

Conventional wisdom says that interleaving data is faster, but I’ve never seen any actual benchmarks proving this, and I know for a fact that there are architectures that are clearly faster with separate streams.

Reedbeta 167 Jul 31, 2013 at 07:17

Separate streams are also handy when different rendering passes use different subsets of the vertex attributes. For instance, shadow mapping usually only requires positions, not UVs or normals, etc. Putting the positions in their own stream can speed up the shadow passes since they’re typically vertex-bound and you get better cache locality with all the positions stored together. On PS3 we had the positions in their own stream and everything else in a second stream; it was an important optimization (though I don’t remember the numbers unfortunately).
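The cache-locality point can be illustrated with a small counting sketch: how many cache lines does a position-only pass touch when positions are packed tightly versus interleaved with other attributes? The 64-byte line size and the 32-byte interleaved vertex are assumptions. Real GPU vertex fetch hardware differs between vendors, so this shows only the trend.

```python
def cache_lines_touched(vertex_count, stride, offset, size, line_bytes=64):
    """Count distinct cache lines read when fetching one attribute per vertex."""
    lines = set()
    for v in range(vertex_count):
        start = v * stride + offset
        end = start + size - 1
        lines.update(range(start // line_bytes, end // line_bytes + 1))
    return len(lines)

VERTICES = 1024
POSITION_BYTES = 12  # 3 floats

# Positions in their own tightly packed stream.
separate = cache_lines_touched(VERTICES, stride=POSITION_BYTES, offset=0, size=POSITION_BYTES)

# Positions at the start of an assumed 32-byte interleaved vertex.
interleaved = cache_lines_touched(VERTICES, stride=32, offset=0, size=POSITION_BYTES)
```

With these numbers the tightly packed stream touches 192 lines and the interleaved layout 512, because the unused normals and UVs are pulled through the cache along with each position.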

Stainless 151 Jul 31, 2013 at 08:03

On some chipsets, particularly the ones used by Broadcom, VBOs are essential.

I wrote a simple GLES2 renderer which drew my scene at 200 fps on my laptop. When I dropped it onto the Broadcom box, it ran at 12 fps.

After just adding VBOs, I got 240 fps on the laptop and 305 fps on the Broadcom.

I have also noticed that separating the positions into their own stream gives a speed increase.

Snoob 105 Jul 31, 2013 at 10:03

Thanks for the hints. How many streams would you use or prefer? At the moment I favor 4 separate streams:

  1. VertexStream (Position, VertexColor(s))
  2. TextureStream (Texture coordinates 0-7)
  3. LitStream (Normal, Tangent, Binormal)
  4. IndexBuffer

So I could render the following combinations: 1, 1+2, 1+3, 1+2+3, and the same combinations with 4 as well.
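One possible way to track which streams a draw call needs is a bit flag per stream. The sketch below enumerates the combinations listed above. The flag names and values are hypothetical, and the index buffer is treated as an independent on/off choice that doubles the count.

```python
from enum import IntFlag

class Stream(IntFlag):
    VERTEX = 1   # position, vertex colors
    TEXTURE = 2  # texture coordinates
    LIT = 4      # normal, tangent, binormal

def valid_combinations():
    """Every stream set that includes the mandatory VertexStream."""
    optional = [Stream.TEXTURE, Stream.LIT]
    combos = []
    for mask in range(1 << len(optional)):
        combo = Stream.VERTEX
        for bit, stream in enumerate(optional):
            if mask & (1 << bit):
                combo |= stream
        combos.append(combo)
    return combos

combos = valid_combinations()
# Each combination can be drawn with or without the index buffer.
total_draw_variants = len(combos) * 2
```

A shader or material could then declare the mask it requires, and the mesh code would bind only the streams whose bits are set.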

Vilem_Otte 117 Jul 31, 2013 at 10:27

I use separate streams too. Interleaving is good, but as mentioned, for shadow mapping you need only the positions, nothing else. And even if interleaved streams were faster than separate streams, the performance hit would be so small that it is hidden by far more expensive stuff (at least in my renderer).