A couple of ideas:

(1) use the automatic mipmap generation capabilities of the hardware to
generate the mean (you can read back the value of the topmost mip level
to get the mean)

(2) render the texture to a second texture using a shader that
calculates the squared difference from that mean at each pixel, and use
automatic mipmap generation again to generate the mean of those squared
differences (i.e. the variance)

(3) then you can read back that second mean and take the square root to get the standard deviation.
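
The three steps above can be sketched on the CPU to check the arithmetic. This is only an illustration, not GPU code: it assumes a square, power-of-two texture and a plain 2x2 box filter for each mip level (what a call such as glGenerateMipmap is commonly expected to do, though real hardware filtering and 16-bit float precision may differ). The function names are made up for this sketch.

```python
import math

def downsample_2x2(level):
    """Average each 2x2 block, as a box-filtered mip level would."""
    size = len(level)
    return [
        [
            (level[y][x] + level[y][x + 1]
             + level[y + 1][x] + level[y + 1][x + 1]) / 4.0
            for x in range(0, size, 2)
        ]
        for y in range(0, size, 2)
    ]

def mip_mean(texture):
    """Reduce to the 1x1 top mip level and return its single texel."""
    size = len(texture)
    # Assumption of this sketch: square texture with power-of-two size.
    if size == 0 or size & (size - 1) or any(len(row) != size for row in texture):
        raise ValueError("expected a square, power-of-two texture")
    level = texture
    while len(level) > 1:
        level = downsample_2x2(level)
    return level[0][0]

def gpu_style_std_dev(texture):
    mean = mip_mean(texture)                        # step (1)
    # Step (2): the per-pixel "shader" writes the squared difference ...
    squared_diff = [[(v - mean) ** 2 for v in row] for row in texture]
    variance = mip_mean(squared_diff)               # ... and mipmaps average it
    return math.sqrt(variance)                      # step (3)

texture = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
print(gpu_style_std_dev(texture))
```

Note that this yields the population standard deviation (dividing by N, not N - 1), since the mipmap chain computes a plain average.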

BTW, all this falls under the category of general-purpose GPU computation (a phrase you may want to google). People have done much more complicated things with it, like performing simulations of cloth on the GPU using texels to store the position of vertices of the cloth surface, and shaders to update them at each timestep.

I am developing an application for work that deals with sensor performance. I need to calculate the standard deviation of an entire color buffer (a 1-channel, 16-bit floating-point framebuffer object) in OpenGL. For testing purposes, I am simply reading the buffer back into main memory and calculating the standard deviation on the CPU. Of course, as you might expect, this is a killer on the FPS.

I have googled and googled and searched everywhere, but I cannot find any references or leads on this. The only reference that was mentioned was some article by Horn that talks about calculating the sum of an entire buffer using shaders, but I could not find the article itself.

My first idea is to simply use the non-programmable pipeline and do additive blending onto another 16-bit floating-point surface: draw vertical lines on top of each other over and over again, with the texture coordinates of each vertical line referencing the buffer, once for each pixel in the x direction (I may skip pixels to just approximate the s.d.), and then draw single points on top of each other referencing the horizontal sums. Dividing the total by the number of pixels would give me the mean value; then I would repeat the process somehow using the squared differences from the mean. This doesn’t seem terribly efficient, but it would be better than reading the buffer into main memory. Also, I don’t know how to do this without having to ping-pong between textures when it comes to the second pass using the mean.
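
To make the arithmetic of that two-stage summation concrete, here is a rough CPU-side sketch. It is not OpenGL code: the nested loops only stand in for additive blending, and the names `two_stage_mean`, `two_pass_std_dev`, `transform` and `step` are hypothetical, chosen for this illustration. `step` models skipping columns to approximate the result.

```python
import math

def two_stage_mean(buffer, transform=lambda v: v, step=1):
    """Mean of transform(pixel) over every `step`-th column of the buffer."""
    height = len(buffer)
    columns = range(0, len(buffer[0]), step)   # sampled x positions

    # Stage 1: "blend" each sampled vertical line onto one column,
    # leaving one horizontal sum per row.
    row_sums = [0.0] * height
    for x in columns:
        for y in range(height):
            row_sums[y] += transform(buffer[y][x])

    # Stage 2: "blend" every horizontal sum onto a single point.
    total = 0.0
    for row_sum in row_sums:
        total += row_sum

    return total / (height * len(columns))

def two_pass_std_dev(buffer, step=1):
    # Pass 1: mean of the raw values.
    mean = two_stage_mean(buffer, step=step)
    # Pass 2: same summation, applied to squared differences from the mean.
    variance = two_stage_mean(buffer, lambda v: (v - mean) ** 2, step)
    return math.sqrt(variance)

buffer = [[1.0, 2.0],
          [3.0, 4.0]]
print(two_pass_std_dev(buffer))          # uses every column
print(two_pass_std_dev(buffer, step=2))  # approximates using column 0 only
```

The second pass depends on the mean from the first, which is exactly where the ping-pong question arises on the GPU.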

Anybody have better (more efficient) ideas for this?

Thanks in advance,

Odlantern