What part of gaming requires the bulk of processing power?

Gnarlyman 101 Mar 15, 2012 at 00:08

I’m very curious. In this age of manic polygon counts and the competition between developers over them, and also a growing interest in data structures such as octrees…

What does indeed constitute the bulk of what the CPU and GPU have to expend their resources on? Is it really the number of polygons ITSELF, or is it all the trigonometry and other math involved in calculating each triangle (I know that sounds almost identical, but there’s a slight but meaningful difference there)? Or is it the texture mapping? Or something else? Exactly what IS the thing (or things) that requires the bulk of the focus/work from the CPU, GPU, memory, etc.?

15 Replies


v71 105 Mar 15, 2012 at 10:49

The dividing line between CPU and GPU is fading; with languages like CUDA, the GPU today can do almost the same kind of work the CPU can.
I think the next few years will see a lot of algorithms for computational visibility problems solved on the GPU; it remains to be seen what the CPU will be used for. I am seeing a trend to move everything to the GPU, basically because of its high parallelism. Texture mapping has a rather finite scope; I would like to see more research into voxels. I dream of a world made up of little atomic instances you can interact with, without specifying a surface and a texture in separate contexts.
The main problem is memory, but you never know what’s around the corner. Recently I read an academic paper about a new kind of technology for reading and writing data using femtosecond pulses of heat; the ratio of data to surface area was huge compared to a current hard disk.
Let’s pretend that in the future the cost of accessing memory and mass storage becomes comparable; then we are going to see far more than just CPU and GPU usage. A huge octree storing voxels rather than triangles, accessed directly from the solid-state drive… yummy.

roel 101 Mar 15, 2012 at 12:55

Your title mentions “gaming”, but your post only considers graphics. Graphics are quite often a part of a game, but a game is more than just graphics. And your question cannot be answered in general; a chess game is a lot of AI and a little graphics, for example.

Gnarlyman 101 Mar 15, 2012 at 19:17

I see. Thanks for the input. Really, I guess I’d like to know what’s more taxing: the number of polygons or the required calculations on them?

V71: I’m sooooo with you on voxels. I find them to be an absolutely killer deal. I’ve been working with a bunch of voxel concepts myself recently. I truly hope they become the future, and in short order too. I think it can be done as well.

Reedbeta 168 Mar 15, 2012 at 21:26

“The number of polygons” vs “the required calculations on them” makes no sense to me as a distinction. What computers spend their time doing is calculations. Data takes no time if you don’t do anything with it! :)

There are actual, useful distinctions that can be made, though. I’ll quote part of an answer I posted on GameDev StackExchange about it:

Performance of graphics hardware is a complex topic. It is a concurrent, pipelined system that contains many different processes operating simultaneously and passing data to one another.

In general, the speed of the whole system is controlled by the speed of the slowest component. At different times and under different circumstances, different components may be the bottleneck. For instance, we often render triangles by running a vertex shader followed by a pixel shader. If the vertex shader is slower than the pixel shader, the speed of the whole rendering will be determined by how fast the vertices can be done, and we say the system is “vertex-bound” or “vertex-limited”. Or if the pixel shader is slower, we say it’s “pixel-bound”, etc.

Within shaders we can also look at computation versus memory access. Sometimes a shader may be compute-bound, meaning it has enough math operations to hide the latency of memory reads (texture sampling) and writes (output to a render target). In other cases the shader may be bandwidth-bound, meaning that it needs to wait for memory access to complete, so memory bandwidth (how fast you can read and write memory) is the bottleneck. Within bandwidth-bound cases we can distinguish being bound by input bandwidth (texture bound), or by output bandwidth.
The texture-bound case is the one you’re asking about. If a shader is texture-bound, then switching to a texture format twice as wide will make it render approximately half as fast. However, switching to a thinner texture format might not make it go much faster, if it moves the bottleneck somewhere else.

Conversely, if the shader is not texture-bound, switching to a thinner format will have no effect on the overall speed, since the bottleneck is somewhere else. But switching to a wider format might cause the shader to become texture-bound and slow it down.
In conclusion, there’s no fixed relationship between texture format and rendering speed. Thinner texture formats will be no slower than wider ones, but they might not be faster either - it depends on details of the shader and the scene.

I focused on texture formats there because that’s what the guy was asking about, but the same applies to pretty much any variable that could affect performance, such as the numbers of vertices or triangles.
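
To make the compute-bound versus bandwidth-bound distinction concrete, here is a minimal CUDA sketch (the kernel names and the 256-iteration loop are just made up for illustration): the first kernel barely touches the ALUs and is limited by memory bandwidth, the second does a pile of math per element and is limited by compute. Timing both with a profiler shows two different bottlenecks for the same amount of data moved.

    #include <cuda_runtime.h>
    #include <cstdio>

    // Bandwidth-bound: one read, one write, almost no math per element.
    __global__ void copyKernel(const float* in, float* out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[i];
    }

    // Compute-bound: the same memory traffic, but many ALU operations per element.
    __global__ void heavyMathKernel(const float* in, float* out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
        {
            float x = in[i];
            for (int k = 0; k < 256; ++k)  // dependent math dominates the runtime
                x = x * 1.0001f + 0.5f;
            out[i] = x;
        }
    }

    int main()
    {
        const int n = 1 << 24;
        float *in, *out;
        cudaMalloc(&in,  n * sizeof(float));
        cudaMalloc(&out, n * sizeof(float));
        cudaMemset(in, 0, n * sizeof(float));

        dim3 block(256), grid((n + 255) / 256);
        copyKernel<<<grid, block>>>(in, out, n);       // limited by memory bandwidth
        heavyMathKernel<<<grid, block>>>(in, out, n);  // limited by ALU throughput
        cudaDeviceSynchronize();

        cudaFree(in);
        cudaFree(out);
        printf("done\n");
        return 0;
    }

Shrinking the input data format would speed the first kernel up noticeably and barely touch the second, which is the same behaviour described above for texture-bound versus non-texture-bound shaders.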

Stainless 151 Mar 16, 2012 at 10:01

You have to consider the platform as well.

The GPU is incredibly useful, though it lacks one key ability. If the chip manufacturers can work out a way of doing random access textures so we can treat them as RAM, then the CPU suddenly becomes far less important.

Physics chips exist; if they become more common, that’s one less thing for the CPU to worry about.

The slowest part of a game is game dependent. If you are doing a beautiful 3D landscape with soft shadows, light rays, and all the other toys, then that’s where the most T-states are going to be used.

If you are doing a simple sprite game, but with complicated physics, then the physics engine is going to burn the most T-states.

If you are doing a board game, like chess, then the AI is going to burn T-states like charcoal in the Super Bowl car park.

The key to the problem is looking at the game design and working out where the bottleneck will be, then spending a lot of time coming up with a good design for that part of the game.

Then finish the game (god, three words that represent something that most people just cannot do :) ) and if it’s still running slow, profile.

Profiling is so important, and it often shows problems that you didn’t even consider. For instance, in C# the garbage collector is rubbish; if you create a lot of garbage, your game will stutter.

Gnarlyman 101 Mar 16, 2012 at 15:08

Thanks Reedbeta and Stainless. Both posts were helpful. One of the things I’m trying to get a read on is issues like the 360 and the PS3 being able to handle different amounts of polygon data, amongst other things. I do realize that “polygon count” and “calculations” are usually considered synonymous for obvious reasons, but I wanted to ask anyways.

Stainless 151 Mar 18, 2012 at 09:47

We talk about “Budgets”.

You have a polygon ‘budget’ and have to split it amongst the background and characters. It’s a pain, but it does help when you have a lot of people working on a project.

I don’t know much about the PS3, but the 360 has a few quirks that you have to plan for.

If you are writing for XNA, then you have a download size limit to worry about. You also only have 512 MB of RAM to play with.

If you are shipping on DVD, then obviously the download limit is not an issue, but the memory issue remains.

I tend to load all of my content at the start of the game; that way I know very quickly if I run out of storage and can do something about it.

It also means you have no loading screens once the game actually gets up and running.

Good luck, have fun.

alphadog 101 Mar 22, 2012 at 13:23

@v71

I am seeing a trend to move everything to the GPU, basically because of its high parallelism.

GPU vs CPU (or GPU vs PPU) is not really a huge, fundamental difference; they are computation devices with different design goals, especially now that the average CPU has multiple parallel cores. (In fact, I remember a paper that showed that, with optimized code, an i7 could perform about the same as a GTX 200 of some sort.) Graphics computation, of course, is (relatively) highly parallelizable, so it benefits greatly from a GPU specifically architected with those goals in mind, but a cutting-edge CPU with highly optimized code will not be multiple orders of magnitude off.

What I think we’ll see is PUs with lots of different cores. Some will favor parallelism, some will favor large local caches, some will favor lower power/heat envelopes.

Stainless is right, the part of a game that requires the most computational power is highly dependent on the type of game.

JarkkoL 102 Mar 22, 2012 at 21:20

GPUs currently have an order of magnitude more processing power than CPUs, which is why there’s a trend to move computationally intensive stuff to the GPU, and this gap will only increase in the future. Current high-end CPUs are around the 100 GFLOPS mark while GPUs are at 1 TFLOPS.

Stainless 151 Mar 23, 2012 at 13:28

Yes, but that’s not the whole story. What matters is being able to apply that brute force to a problem.

When you are using textures as arrays of values, rather than as they were designed to be used, you hit issues.

In RAM I can do this…

    array[address1]+=dx;
    array[address2]-=dx;

To do the same thing in a shader is very difficult.

For some problems (like fluid simulation) this can mean that you are doing several passes and dragging textures backwards and forwards across the bus all the time. Very slow compared to the raw speed of the GPU just running some code.

Another thing shaders (currently) cannot do is the equivalent of this.

   array[position1]=array[position2];

Once a texture is in the GPU, it is not read/write: you can read OR you can write, not both.

If the chip designers get around this, well we are going to see some really fast and freaky effects :D

Reedbeta 168 Mar 23, 2012 at 16:15

In Shader Model 5 (D3D11) there are read-write textures as well as other kinds of more generic buffers, as shown here. And I believe CUDA/OpenCL have been able to do what you suggest for years. :)
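
For instance, here is a minimal CUDA sketch (the buffer name and the indices are invented for illustration) of exactly the two updates from your post, done with random access directly in GPU memory:

    #include <cuda_runtime.h>

    // The two updates from the post above, done with random access in GPU memory.
    __global__ void scatterUpdate(float* a, int address1, int address2,
                                  int position1, int position2, float dx)
    {
        // One thread is enough to show the point; a real kernel would do this
        // across many elements and use atomics where indices can collide.
        if (blockIdx.x == 0 && threadIdx.x == 0)
        {
            a[address1] += dx;           // random-access read-modify-write
            a[address2] -= dx;
            a[position1] = a[position2]; // random-access copy within the same buffer
        }
    }

    int main()
    {
        float* a;
        cudaMalloc(&a, 1024 * sizeof(float));
        cudaMemset(a, 0, 1024 * sizeof(float));
        scatterUpdate<<<1, 1>>>(a, 3, 7, 11, 42, 0.25f);
        cudaDeviceSynchronize();
        cudaFree(a);
        return 0;
    }

The D3D11 equivalent would be a compute shader writing to an RWStructuredBuffer or RWTexture2D, which is what the Shader Model 5 read-write resources mentioned above give you.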

JarkkoL 102 Mar 23, 2012 at 16:45

@Stainless

If the chip designers get around this, well we are going to see some really fast and freaky effects :D

Then you will be positively surprised if you look into DirectCompute, for example, since compute shaders support random access writes (: You’ll have to be careful with memory access patterns to write efficient code though, and utilize the local data share efficiently.
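
As a rough illustration of what “utilize the local data share” means, here is a small CUDA sketch (the kernel name, tile size and 3-tap blur are made up for the example): each block stages a tile of the input in on-chip shared memory, so neighbouring threads reuse values without extra trips to global memory.

    #include <cuda_runtime.h>

    #define TILE 256

    // Each block stages TILE values in on-chip shared memory (the "local data
    // share"), then every thread reuses its neighbours' values without going
    // back to global memory. Here: a simple 3-tap blur over a 1D array.
    __global__ void blur1D(const float* in, float* out, int n)
    {
        __shared__ float tile[TILE + 2];            // +2 for a one-element halo per side

        int gid = blockIdx.x * TILE + threadIdx.x;  // global index
        int lid = threadIdx.x + 1;                  // local index, leaving room for the halo

        tile[lid] = (gid < n) ? in[gid] : 0.0f;
        if (threadIdx.x == 0)                       // left halo
            tile[0] = (gid > 0) ? in[gid - 1] : 0.0f;
        if (threadIdx.x == TILE - 1)                // right halo
            tile[TILE + 1] = (gid + 1 < n) ? in[gid + 1] : 0.0f;

        __syncthreads();                            // wait until the whole tile is loaded

        if (gid < n)
            out[gid] = (tile[lid - 1] + tile[lid] + tile[lid + 1]) / 3.0f;
    }

    int main()
    {
        const int n = 1 << 20;
        float *in, *out;
        cudaMalloc(&in,  n * sizeof(float));
        cudaMalloc(&out, n * sizeof(float));
        cudaMemset(in, 0, n * sizeof(float));
        blur1D<<<(n + TILE - 1) / TILE, TILE>>>(in, out, n);
        cudaDeviceSynchronize();
        cudaFree(in);
        cudaFree(out);
        return 0;
    }

In DirectCompute the same pattern uses a groupshared array with a group memory barrier instead of __shared__ and __syncthreads().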

Stainless 151 Mar 23, 2012 at 20:38

:wub:

//// BOLLOCKS ////

I’m working in OpenGL ES now and can’t use the new toys… :(

Gnarlyman 101 Apr 03, 2012 at 21:49

Wow, this is some cool stuff going on here. Def getting a bit larger of an education with my original Q than I had bargained for. Thanks y’all.

alphadog 101 Apr 04, 2012 at 16:31

@JarkkoL

GPUs currently have an order of magnitude more processing power than CPUs, which is why there’s a trend to move computationally intensive stuff to the GPU, and this gap will only increase in the future. Current high-end CPUs are around the 100 GFLOPS mark while GPUs are at 1 TFLOPS.

GPUs currently have an order of magnitude more limited-type processing power than CPUs. Only FLOP-heavy “computationally intensive stuff” should be moved over to the GPU.

They can’t do general computing like CPUs.