# What part of gaming requires the bulk of processing power?

15 replies to this topic

### #1Gnarlyman

Valued Member

• Members
• 109 posts
• LocationMilwaukee

Posted 15 March 2012 - 12:08 AM

I'm very curious. In this age of manic polygon counts, competition between developers over them, and growing interest in data structures such as octrees...

What does indeed constitute the bulk of what the CPU and GPU have to expend their resources on? Is it really the number ITSELF of polygons, or is it all the trigonometry and other math involved in calculating each triangle (I know that sounds almost identical, but there's a slight but meaningful difference there)? Or is it the texture mapping? Or something else? Exactly what IS the thing(s) that requires the bulk of the focus/work from the CPU, GPU, memory, etc.?

### #2v71

Valued Member

• Members
• 355 posts

Posted 15 March 2012 - 10:49 AM

The dividing line between CPU and GPU is fading; with languages like CUDA, the GPU today can do almost the same amount of work the CPU can.
I think the next few years will see a lot of algorithms for computational visibility problems solved on the GPU; it remains to be seen what the CPU will be used for. I am seeing a trend to move everything onto the GPU, basically because of its high parallelism. Texture mapping has a rather finite scope. I would like to see more research into voxels; I dream of a world made up of little atomic instances you can interact with, without specifying a surface and a texture in separate contexts.
The main problem is memory, but you never know what's around the corner. Recently I read an academic paper about a new kind of technology for data reading and writing using femtosecond pulses of heat; the ratio of data to surface area was huge compared to a current hard disk.
Let's pretend that in the future the cost of accessing memory and mass storage will be comparable; then we are going to see far more than today's GPU and CPU usage. A huge octree storing voxels rather than triangles, accessed directly from a solid-state hard drive... yummy
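As a toy illustration of the structure v71 is describing, here is a minimal sparse voxel octree sketch. All names (`VoxelOctree`, `OctreeNode`) are made up for this example; a real engine would pack nodes far more tightly, but the idea is the same: only occupied regions of space allocate memory.

```python
# Toy sparse voxel octree: only occupied regions allocate nodes.
# Illustrative sketch, not any real engine's data structure.

class OctreeNode:
    __slots__ = ("children", "value")

    def __init__(self):
        self.children = [None] * 8  # one slot per octant
        self.value = None           # payload for leaf voxels

class VoxelOctree:
    def __init__(self, depth):
        self.depth = depth          # depth d covers a 2^d x 2^d x 2^d cube
        self.root = OctreeNode()

    def _octant(self, x, y, z, level):
        # Pick the child index from one bit of each coordinate.
        bit = 1 << level
        return ((x & bit) and 1) | (((y & bit) and 1) << 1) | (((z & bit) and 1) << 2)

    def set(self, x, y, z, value):
        node = self.root
        for level in range(self.depth - 1, -1, -1):
            i = self._octant(x, y, z, level)
            if node.children[i] is None:
                node.children[i] = OctreeNode()  # allocate lazily
            node = node.children[i]
        node.value = value

    def get(self, x, y, z):
        node = self.root
        for level in range(self.depth - 1, -1, -1):
            node = node.children[self._octant(x, y, z, level)]
            if node is None:
                return None                      # empty space costs nothing
        return node.value

tree = VoxelOctree(depth=8)        # 256^3 addressable voxels
tree.set(10, 20, 30, "stone")
print(tree.get(10, 20, 30))        # -> stone
print(tree.get(0, 0, 0))           # -> None
```

Empty space costs nothing here, which is exactly why octrees are attractive for huge voxel worlds.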
Check my code in the c/c++ section :
http://www.binpress.com/browse/c

### #3roel

Senior Member

• Members
• 698 posts

Posted 15 March 2012 - 12:55 PM

Your title mentions "gaming", but your post only considers graphics. Graphics are quite often part of a game, but a game is more than just graphics. And your question cannot be answered in general; a chess game, for example, is a lot of AI and a little graphics.

### #4Gnarlyman

Valued Member

• Members
• 109 posts
• LocationMilwaukee

Posted 15 March 2012 - 07:17 PM

I see. Thanks for the input. Really, I guess I'd like to know what's more taxing: the number of polygons, or the required calculations on them?

v71: I'm sooooo with you on voxels. I find them to be an absolutely killer deal. I've been working with a bunch of voxel concepts myself recently. I truly hope they become the future, and in short order too. I think it can be done as well.

### #5Reedbeta

DevMaster Staff

• 5309 posts
• LocationSanta Clara, CA

Posted 15 March 2012 - 09:26 PM

"The number of polygons" vs "the required calculations on them" makes no sense to me as a distinction. What computers spend their time doing is calculations. Data takes no time if you don't do anything with it!

There are actual, useful distinctions that can be made, though. I'll quote part of an answer I posted on GameDev StackExchange about it:

Quote

Performance of graphics hardware is a complex topic. It is a concurrent, pipelined system that contains many different processes operating simultaneously and passing data to one another.

In general, the speed of the whole system is controlled by the speed of the slowest component. At different times and under different circumstances, different components may be the bottleneck. For instance, we often render triangles by running a vertex shader followed by a pixel shader. If the vertex shader is slower than the pixel shader, the speed of the whole rendering will be determined by how fast the vertices can be done, and we say the system is "vertex-bound" or "vertex-limited". Or if the pixel shader is slower, we say it's "pixel-bound", etc.

Within shaders we can also look at computation versus memory access. Sometimes a shader may be compute-bound, meaning it has enough math operations to hide the latency of memory reads (texture sampling) and writes (output to a render target). In other cases the shader may be bandwidth-bound, meaning that it needs to wait for memory access to complete, so memory bandwidth (how fast you can read and write memory) is the bottleneck. Within bandwidth-bound cases we can distinguish being bound by input bandwidth (texture bound), or by output bandwidth.

The texture-bound case is the one you're asking about. If a shader is texture-bound, then switching to a texture format twice as wide will make it render approximately half as fast. However, switching to a thinner texture format might not make it go much faster, if it moves the bottleneck somewhere else.

Conversely, if the shader is not texture-bound, switching to a thinner format will have no effect on the overall speed, since the bottleneck is somewhere else. But switching to a wider format might cause the shader to become texture-bound and slow it down.

In conclusion, there's no fixed relationship between texture format and rendering speed. Thinner texture formats will be no slower than wider ones, but they might not be faster either - it depends on details of the shader and the scene.

I focused on texture formats there because that's what the guy was asking about, but the same applies to pretty much any variable that could affect performance, such as the numbers of vertices or triangles.
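To put rough numbers on the bandwidth-bound case: a back-of-envelope sketch (all figures below are made up for illustration, not real hardware specs) shows why halving the texel size can halve the required bandwidth when a shader is texture-bound.

```python
# Back-of-envelope texture read bandwidth, with illustrative numbers.

def texture_bandwidth_gb_s(width, height, bytes_per_texel, samples_per_pixel, fps):
    """GB/s of texture reads needed to shade every pixel each frame."""
    texels_read = width * height * samples_per_pixel
    bytes_per_frame = texels_read * bytes_per_texel
    return bytes_per_frame * fps / 1e9

# A 1920x1080 target, 4 texture samples per pixel, 60 fps:
wide = texture_bandwidth_gb_s(1920, 1080, 8, 4, 60)   # 16-bit-per-channel RGBA
thin = texture_bandwidth_gb_s(1920, 1080, 4, 4, 60)   # 8-bit-per-channel RGBA
print(f"{wide:.1f} GB/s vs {thin:.1f} GB/s")          # -> 4.0 GB/s vs 2.0 GB/s
```

Whether that halved bandwidth translates into doubled frame rate depends, as the quote says, on whether the bottleneck stays at texture reads or moves elsewhere.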
reedbeta.com - developer blog, OpenGL demos, and other projects

### #6Stainless

Member

• Members
• 582 posts
• LocationSouthampton

Posted 16 March 2012 - 10:01 AM

You also have to consider the platform.

The GPU is incredibly useful, though it lacks one key ability: random access to textures. If the chip manufacturers can work out a way of letting us treat textures as RAM, then the CPU suddenly becomes far less important.

Physics chips exist, if they become more common then that's one less thing for the CPU to worry about.

The slowest part of a game is game-dependent. If you are doing a beautiful 3D landscape with soft shadows, light rays, and all the other toys, then that's where the most T-states are going to be used.

If you are doing a simple sprite game, but with complicated physics, then the physics engine is going to burn the most T-states.

If you are doing a board game, like chess, then the AI is going to burn T-states like charcoal at the Superbowl car park.

The key to the problem is looking at the game design and working out where the bottleneck will be, then spending a lot of time coming up with a good design for that part of the game.

Then finish the game (god, three words that represent something most people just cannot do) and if it's still running slow, profile.

Profiling is so important, and it often shows problems you didn't even consider. For instance, in C# the garbage collector is rubbish; if you create a lot of garbage, your game will stutter.
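The standard workaround in any garbage-collected language is to preallocate objects up front and recycle them, instead of churning out garbage every frame. A minimal sketch of the idea (the `Pool` and `Particle` names are purely illustrative, and the pattern translates directly to C#):

```python
# Free-list object pool: allocate once at startup, reuse forever.
# Illustrative sketch of the pattern used to avoid GC stutter.

class Particle:
    def __init__(self):
        self.x = self.y = 0.0
        self.alive = False

class Pool:
    def __init__(self, factory, size):
        self.free = [factory() for _ in range(size)]   # preallocate up front

    def acquire(self):
        # No allocation during the frame; returns None if the pool is exhausted.
        return self.free.pop() if self.free else None

    def release(self, obj):
        obj.alive = False
        self.free.append(obj)   # recycle instead of letting it become garbage

pool = Pool(Particle, 1000)
p = pool.acquire()
p.alive = True
# ... use p during the frame ...
pool.release(p)
```

Because nothing is allocated or freed per frame, the garbage collector has nothing to collect mid-game.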

### #7Gnarlyman

Valued Member

• Members
• 109 posts
• LocationMilwaukee

Posted 16 March 2012 - 03:08 PM

Thanks Reedbeta and Stainless. Both posts were helpful. One of the things I'm trying to get a read on is issues like the 360 and the PS3 being able to handle different amounts of polygon data, amongst other things. I do realize that "polygon count" and "calculations" are usually considered synonymous for obvious reasons, but I wanted to ask anyways.

### #8Stainless

Member

• Members
• 582 posts
• LocationSouthampton

Posted 18 March 2012 - 09:47 AM

You have a polygon 'budget' and have to split it amongst the background and characters. It's a pain, but it does help when you have a lot of people working on a project.

I don't know much about the PS3, but the 360 has a few quirks that you have to plan for.

If you are writing for XNA, then you have a download limit to worry about. You also only have 512 MB of RAM to play with.

If you are shipping on dvd, then obviously the download limit is not an issue, but the memory issue remains.

I tend to load all of my content at the start of the game; that way I know very quickly if I run out of storage, and can do something about it.

It also means you have no loading screens once the game actually gets up and running.

Good luck, have fun.

### #9

DevMaster Staff

• Moderators
• 1716 posts

Posted 22 March 2012 - 01:23 PM

v71, on 15 March 2012 - 10:49 AM, said:

I am seeing a trend to move everything onto the GPU, basically because of its high parallelism.

GPU vs CPU (or GPU vs PPU) is not really a huge, fundamental difference; they are computation devices with different design goals, especially now with parallel cores in the average CPU. (In fact, I remember a paper showing that, with optimized code, an i7 could perform about the same as a GTX 200-series card of some sort.) Graphics computation, of course, is (relatively) highly parallelizable, so it benefits greatly from a GPU specifically architected with those goals in mind, but a cutting-edge CPU running highly optimized code will not be multiple orders of magnitude off.

What I think we'll see is PUs with lots of different cores. Some will favor parallelism, some will favor large local caches, some will favor lower power/heat envelopes.

Stainless is right, the part of a game that requires the most computational power is highly dependent on the type of game.
Hyperbole is, like, the absolute best, most wonderful thing ever! However, you'd be an idiot to not think dogmatism is always bad.

### #10JarkkoL

Senior Member

• Members
• 475 posts

Posted 22 March 2012 - 09:20 PM

GPUs currently have an order of magnitude more processing power than CPUs, which is why the trend is to move computationally intense stuff to the GPU, and this gap will only increase in the future. Current high-end CPUs are around the 100 GFLOPS mark while GPUs are at 1 TFLOPS.
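For context, headline figures like those fall out of a simple peak-throughput formula: cores times clock times FLOPs issued per core per cycle. The core counts and clocks below are illustrative of 2012-era parts, not exact specs of any real chip:

```python
# Rough peak-throughput arithmetic behind GFLOPS marketing numbers.
# All hardware figures are illustrative, not real spec-sheet values.

def peak_gflops(cores, clock_ghz, flops_per_cycle):
    return cores * clock_ghz * flops_per_cycle

cpu = peak_gflops(cores=4,   clock_ghz=3.0, flops_per_cycle=8)   # wide SIMD per core
gpu = peak_gflops(cores=512, clock_ghz=1.3, flops_per_cycle=2)   # many simple cores
print(cpu, "GFLOPS vs", gpu, "GFLOPS")   # ~96 vs ~1331: about an order of magnitude
```

Peak numbers assume every unit is busy every cycle, which real workloads rarely achieve, so they overstate the usable gap.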

### #11Stainless

Member

• Members
• 582 posts
• LocationSouthampton

Posted 23 March 2012 - 01:28 PM

Yes, but that's not the whole story. What matters is being able to apply that brute force to a problem.

When you are using textures as arrays of values, rather than as they were designed to be used, you hit issues.

In RAM I can do this:

    array[address1] += dx;


To do the same thing in a shader is very difficult.

For some problems (like fluid simulation) this can mean that you are doing several passes and dragging textures backwards and forwards across the bus all the time. Very slow compared to the raw speed of the GPU just running some code.
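That multi-pass pattern is usually called ping-pong buffering: each pass reads the whole previous state from one buffer and writes the next state into another, then the two swap roles. A pure-Python sketch of the idea, using 1-D heat diffusion as a stand-in for a fluid step (names and constants are illustrative):

```python
# Ping-pong buffering: read from one buffer, write the other, then swap.
# 1-D heat diffusion stands in for one pass of a GPU fluid simulation.

def diffuse(src, dst):
    n = len(src)
    for i in range(n):
        left = src[max(i - 1, 0)]            # clamp at the edges
        right = src[min(i + 1, n - 1)]
        dst[i] = src[i] + 0.25 * (left - 2 * src[i] + right)

state = [0.0] * 9
state[4] = 100.0                 # a hot spot in the middle
scratch = [0.0] * 9

for _ in range(10):              # ten "render passes"
    diffuse(state, scratch)
    state, scratch = scratch, state   # swap read/write buffers, no copying

print(max(state) < 100.0)        # heat has spread out -> True
```

On a GPU each `diffuse` call is one draw/dispatch writing a render target; the swap is free, but the pass setup and any CPU round trips are the overhead Stainless is describing.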

Another thing shaders (currently) cannot do is the equivalent of this:

    array[position1] = array[position2];


Textures are not read/write once they're inside the GPU. You can read OR you can write, not both.

If the chip designers get around this, well, we are going to see some really fast and freaky effects.

### #12Reedbeta

DevMaster Staff

• 5309 posts
• LocationSanta Clara, CA

Posted 23 March 2012 - 04:15 PM

In Shader Model 5 (D3D11) there are read-write textures as well as other kinds of more generic buffers, as shown here. And I believe CUDA/OpenCL have been able to do what you suggest for years.

### #13JarkkoL

Senior Member

• Members
• 475 posts

Posted 23 March 2012 - 04:45 PM

Stainless, on 23 March 2012 - 01:28 PM, said:

If the chip designers get around this, well we are going to see some really fast and freaky effects

Then you will be positively surprised if you look into DirectCompute, for example, since compute shaders support random access writes (: You'll have to be careful with memory access patterns and make efficient use of the local data share to write efficient code, though.

### #14Stainless

Member

• Members
• 582 posts
• LocationSouthampton

Posted 23 March 2012 - 08:38 PM

//// BOLLOCKS ////

I'm working in OpenGL ES now and can't use the new toys....

### #15Gnarlyman

Valued Member

• Members
• 109 posts
• LocationMilwaukee

Posted 03 April 2012 - 09:49 PM

Wow, this is some cool stuff going on here. Def getting a bit larger of an education with my original Q than I had bargained for. Thanks y'all.

### #16

DevMaster Staff

• Moderators
• 1716 posts

Posted 04 April 2012 - 04:31 PM

JarkkoL, on 22 March 2012 - 09:20 PM, said:

GPUs currently have an order of magnitude more processing power than CPUs, which is why the trend is to move computationally intense stuff to the GPU, and this gap will only increase in the future. Current high-end CPUs are around the 100 GFLOPS mark while GPUs are at 1 TFLOPS.

GPUs currently have an order of magnitude more *limited-type* processing power than CPUs. Only FLOP-heavy "computationally intense stuff" should be moved over to the GPU.

They can't do general computing like CPUs.
