Particle System Performance


25 replies to this topic

#1 SmokingRope

    Valued Member

  • Members
  • PipPipPip
  • 210 posts

Posted 21 November 2009 - 04:19 PM

My question is how many particles you can (or would expect to) throw into a scene without serious performance degradation.

Right now I can throw about 600 into my game world before the frame rate starts dropping below playable levels. It's not heavily optimized or multithreaded yet, and it's running on a 2.6 GHz CPU in a debug build. Adding collision detection to the particles drops the total to perhaps closer to 200.

#2 Vilem Otte

    Valued Member

  • Members
  • PipPipPip
  • 185 posts

Posted 21 November 2009 - 04:32 PM

600 is a rather low number. You should be able to handle at least several thousand particles.

Also, you can try a particle system on the GPU (with the help of geometry shaders it is really easy and damn fast ... adding collision detection on the GPU is, on the other hand, a bit tricky).
My blog about game development (and not just game development) - http://gameprogramme...y.blogspot.com/

If you don't know how to speed up application, go "roarrrrrr!", hit the compiler with the club and use -O3 :D

#3 starstutter

    Senior Member

  • Members
  • PipPipPipPip
  • 1039 posts

Posted 21 November 2009 - 04:39 PM

Do you mean on the CPU or GPU? On the CPU, it depends entirely on how efficient your particle system is and whether you use multithreading. You don't need all that many particles, though, with all of today's fancy shader capabilities (soft particles cut the number needed to about 1/4 for the desired look). I can usually do about 3500 CPU-based particles at reasonable speeds (only using the CPU, that is). That's obviously without physics applied to them. For rendering, are you using instancing? If not, you'll drain your CPU so fast you'll think you have a crippling error in your code.

The GPU is another story though. That depends on how much the particles overlap. Remember that every visually overlapping particle is another large batch of overdrawn pixels. Try to make sure that the particles are not stacked on top of each other. If you need thick smoke, make them more opaque rather than adding more. This is something that just takes a lot of area/level polishing. I don't know of an elegant tech solution to this.
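To put a rough number on "more opaque rather than more particles", here is a minimal sketch. It assumes standard "over" alpha blending and particles of the same color; additive blending behaves differently. The function name `stackedOpacity` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Combined opacity of `layers` overlapping particles, each with opacity
// `alpha`, assuming standard "over" alpha blending:
//   combined = 1 - (1 - alpha)^layers
inline double stackedOpacity(double alpha, int layers)
{
    assert(alpha >= 0.0 && alpha <= 1.0 && layers >= 0);
    return 1.0 - std::pow(1.0 - alpha, layers);
}
```

Under these assumptions, four stacked particles at alpha 0.3 give about 0.76 combined opacity, so a single particle at roughly 0.76 covers similarly while touching a quarter of the pixels.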

EDIT: Oh, and it may seem trivial, but make sure that the particle data is always passed by reference or pointer (unless this absolutely can't be done). You would be stunned at how much of a speed difference this makes.
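A minimal sketch of the pass-by-reference point, assuming the particles live in a `std::vector`. The `Particle` layout and the function names are hypothetical, not taken from anyone's engine in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle layout, for illustration only.
struct Particle {
    float x, y;    // position
    float vx, vy;  // velocity
    float life;    // remaining lifetime in seconds
};

// Takes the container by reference: nothing is copied, and the caller's
// particles are updated in place.
void updateParticles(std::vector<Particle>& particles, float dt)
{
    assert(dt >= 0.0f);
    for (Particle& p : particles) {
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.life -= dt;
    }
}

// Read-only access uses a const reference, which also avoids the copy.
std::size_t countAlive(const std::vector<Particle>& particles)
{
    std::size_t alive = 0;
    for (const Particle& p : particles) {
        if (p.life > 0.0f) {
            ++alive;
        }
    }
    return alive;
}
```

Had `updateParticles` taken `std::vector<Particle>` by value, every call would copy the whole container and the updates would be applied to the copy.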
(\__/)
(='.'=)
This is Bunny. Copy and paste bunny into
(")_(") your signature to help him gain world domination.
bunny also wants to fight spam: Click Here Bots!

#4 TheNut

    Senior Member

  • Moderators
  • 1401 posts
  • LocationThornhill, ON

Posted 21 November 2009 - 06:21 PM

Make sure to pack all your particles into VBOs as well. Don't render them one at a time, otherwise your bottleneck will be state and function overhead. My netbook (1.6GHz Atom) easily handles several thousand with smooth frame rates.
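As a rough sketch of the batching idea: build one contiguous vertex array for all particles on the CPU, then upload it to a single vertex buffer and issue one draw call. The `Particle` and `Vertex` layouts and `buildParticleVertices` are assumptions for illustration; the upload and draw steps are omitted because they depend on the API in use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle { float x, y, size; };   // hypothetical: center and edge length
struct Vertex   { float x, y, u, v; };   // position plus texture coordinates

// Appends two triangles (6 vertices) per particle into one array, so the
// whole system can be submitted in a single draw call.
std::vector<Vertex> buildParticleVertices(const std::vector<Particle>& particles)
{
    std::vector<Vertex> verts;
    verts.reserve(particles.size() * 6);
    for (const Particle& p : particles) {
        assert(p.size >= 0.0f);
        const float h = p.size * 0.5f;
        const Vertex tl{p.x - h, p.y - h, 0.0f, 0.0f};
        const Vertex tr{p.x + h, p.y - h, 1.0f, 0.0f};
        const Vertex bl{p.x - h, p.y + h, 0.0f, 1.0f};
        const Vertex br{p.x + h, p.y + h, 1.0f, 1.0f};
        verts.insert(verts.end(), {tl, tr, bl, tr, br, bl});
    }
    return verts;
}
```

With OpenGL, for example, the result could be uploaded with `glBufferData` and drawn with one `glDrawArrays(GL_TRIANGLES, ...)` call instead of one call per particle.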
http://www.nutty.ca - Being a nut has its advantages.

#5 SmokingRope

    Valued Member

  • Members
  • PipPipPip
  • 210 posts

Posted 21 November 2009 - 06:41 PM

I am using D3DXSprite and none of the perhaps more efficient vertex based solutions. All my particles end up using a single 16x16 animated texture with 4 frames.

This game is an experiment in component-based design, and I've changed the method that looks up components so that it uses a simple array instead of std::list<>, which has huge implications for performance across the app. This change in storage, plus inlining the lookup, at least doubled the total supported particle count.

Here's how it looks now.
	// O(1) lookup: ComponentID indexes directly into the m_Components array.
	inline GameComponent *component(ComponentID pID)
	{
		return m_Components[pID];
	}
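For context, a self-contained sketch of what such an array-backed lookup might look like. The `ComponentID` values, the `GameObject` class, and `setComponent` are hypothetical; the thread only shows the accessor itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct GameComponent {
    virtual ~GameComponent() = default;
};

// Hypothetical ID set; the real ComponentID is not shown in the thread.
enum ComponentID : std::size_t {
    COMPONENT_PHYSICS,
    COMPONENT_RENDER,
    COMPONENT_COUNT
};

class GameObject {
public:
    void setComponent(ComponentID id, GameComponent* c)
    {
        assert(id < COMPONENT_COUNT);
        m_Components[id] = c;
    }

    // Constant-time lookup by index; a std::list would need a linear walk.
    GameComponent* component(ComponentID id) const
    {
        assert(id < COMPONENT_COUNT);
        return m_Components[id];
    }

private:
    // One slot per component type, all null until set.
    std::array<GameComponent*, COMPONENT_COUNT> m_Components{};
};
```

The trade-off is one pointer slot per possible component type on every object, which is cheap as long as the ID set stays small.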

I also went into my rendering code and fixed a somewhat ill-conceived sprite-rendering function which cast each sprite coordinate from float to int and then back from int to float. I perhaps achieved better batching as well by calling ID3DXSprite::Begin() and ID3DXSprite::End() only once per frame instead of once per particle. (Does anyone know whether these methods actually govern when the data is sent to the GPU?)
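The difference between the two calling patterns can be sketched with a stand-in batch class. `SpriteBatch` below is hypothetical and is not the D3DX interface; it simply assumes that everything queued between `begin()` and `end()` is submitted together when `end()` is called.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Sprite { float x, y; };

// Hypothetical stand-in for a sprite interface (NOT the D3DX API).
class SpriteBatch {
public:
    void begin() { assert(!m_open); m_open = true; }
    void draw(const Sprite& s) { assert(m_open); m_pending.push_back(s); }
    void end()
    {
        assert(m_open);
        if (!m_pending.empty()) {
            ++m_submissions;  // one submission per non-empty batch
        }
        m_pending.clear();
        m_open = false;
    }
    std::size_t submissions() const { return m_submissions; }

private:
    std::vector<Sprite> m_pending;
    std::size_t m_submissions = 0;
    bool m_open = false;
};

// One begin/end pair per particle: as many submissions as particles.
void renderPerParticle(SpriteBatch& batch, const std::vector<Sprite>& sprites)
{
    for (const Sprite& s : sprites) {
        batch.begin();
        batch.draw(s);
        batch.end();
    }
}

// One begin/end pair per frame: a single submission for all particles.
void renderPerFrame(SpriteBatch& batch, const std::vector<Sprite>& sprites)
{
    batch.begin();
    for (const Sprite& s : sprites) {
        batch.draw(s);
    }
    batch.end();
}
```

Under that assumption, 600 particles cost 600 submissions the first way and 1 the second, which is consistent with the speedup described above.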

My latest weapon is a 600-particle flame-type projectile. The game can render about 3 of them in debug builds, and a continuous stream (the screenshot must contain about 6000 particles) in release builds with the optimizations mentioned above.

Posted Image

I do desperately want to avoid optimizing too early though, as we all know premature optimization is evil.

EDIT: After some additional testing, I can unleash 20 of these at once in a release build before performance drops, for a total of 12000 particles. That number drops to (a perhaps generous) 1800 particles (or 3 projectiles) with collision detection enabled.

#6 onyxthedog

    Senior Member

  • Members
  • PipPipPipPip
  • 467 posts

Posted 21 November 2009 - 07:13 PM

SmokingRope said:

I do desperately want to avoid optimizing too early though, as we all know premature optimization is evil.
I am by no means the best programmer, or even close to the middle I would guess, but I have done enough to be able to tell you this. Premature optimization CAN be the root of all evil. (No, I am not going against Knuth, hear me out.) If you read the quote again, it says 97% of the time. This doesn't mean that you shouldn't try to be somewhat optimized in the planning stages; rather, as I understand him, he means you shouldn't try to squeeze the most out of the system on the first go-around. You should never let that rule get in the way of progress; that is like taking a rule of thumb and living by it as though it were the Ten Commandments. If it makes sense to optimize now, or it would help for some reason, do so.

In closing, I am not the best programmer, nor anywhere near that level, so take my advice with a grain of salt. Fellow programmers, if I am mistaken tell me, but in the same light, if I am correct tell me as well. Thanks
/* Perfect_day.c */
#include <arcade>
#include <computer>
#include <drinks>
#include <hardware/high_end>
#include <snacks>
#pragma <responisiblities>
...........

#7 Reedbeta

    DevMaster Staff

  • Administrators
  • 4782 posts
  • LocationBellevue, WA

Posted 21 November 2009 - 07:40 PM

I'd agree with onyx...the key word here is "premature". Knuth's dictum doesn't mean wait till the last month of the project to optimize anything, of course. :( As I understand it, the problem is spending time optimizing code when either (a) you don't know that the code causes a performance problem (because you haven't got good profiling data), or (b) the basic functionality and "mission" of that code is still in flux, so the code is likely to have to be rewritten down the line anyway. Of course, even in case (b) you may still have to optimize a bit - if the code performs so horribly that it's hindering further development. In short, apply optimization efforts in proportion to how much benefit they'll get you - both now and at the end of the project.
reedbeta.com - developer blog, OpenGL demos, and other projects

#8 SmokingRope

    Valued Member

  • Members
  • PipPipPip
  • 210 posts

Posted 21 November 2009 - 08:21 PM

In this scenario the performance of the physics and particle systems still exceeds the demand the game places on them, so improving it at all would be premature because the game does not need that performance. As far as the progress of the game and its current design go, there's absolutely no scenario where 6000 particles will be visible on screen at once, even though it is certainly nice to know that this will be feasible down the road should inspiration strike. :)

There are some optimizations to my sweep and prune algorithm that I know would be effective, but I've decided to put them off simply because they won't yield much benefit with the current scene complexity.
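For readers unfamiliar with the technique, a minimal single-axis sweep and prune sketch follows. It is a generic textbook form, not the implementation discussed here; `Interval` and `sweepAndPrune` are hypothetical names, and only the x extents are tested, so the reported pairs are candidates that still need a full overlap check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Extent of one object along the x axis (hypothetical layout).
struct Interval { float minX, maxX; };

// Returns index pairs (smaller index first) whose x extents overlap.
std::vector<std::pair<std::size_t, std::size_t>>
sweepAndPrune(const std::vector<Interval>& boxes)
{
    // Sort indices by the start of each interval.
    std::vector<std::size_t> order(boxes.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        order[i] = i;
    }
    std::sort(order.begin(), order.end(),
              [&boxes](std::size_t a, std::size_t b) {
                  return boxes[a].minX < boxes[b].minX;
              });

    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t i = 0; i < order.size(); ++i) {
        const Interval& a = boxes[order[i]];
        assert(a.minX <= a.maxX);
        for (std::size_t j = i + 1; j < order.size(); ++j) {
            const Interval& b = boxes[order[j]];
            if (b.minX > a.maxX) {
                break;  // sorted by minX: nothing later can overlap a
            }
            pairs.emplace_back(std::min(order[i], order[j]),
                               std::max(order[i], order[j]));
        }
    }
    return pairs;
}
```

The early `break` is what keeps this cheaper than testing every pair when objects are spread out; a common further optimization is to keep the sorted order between frames and repair it with an insertion sort.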

#9 JarkkoL

    Senior Member

  • Members
  • PipPipPipPip
  • 458 posts

Posted 22 November 2009 - 09:27 AM

Not spending time early in the project on optimization can cause you tons of wasted work too, particularly if you are not coding alone in a backroom closet ;) If there are e.g. artists involved in the project it's important that they have some reasonable idea about the final performance to scale the assets properly. It's not a good idea to postpone optimizations to the last stages and then say "oops, our engine isn't able to handle that 20,000 triangle model, could you cut it down to 5,000" or "we can't have 100 enemies running on the screen at once, so could you change the design to have only 30" (: The same applies to programming too: you can spend time implementing 10 different features, but if performance constraints mean you can't have more than 5 of them running, you've wasted time implementing useless features.

#10 SmokingRope

    Valued Member

  • Members
  • PipPipPip
  • 210 posts

Posted 22 November 2009 - 06:16 PM

JarkkoL said:

If there are e.g. artists involved in the project it's important that they have some reasonable idea about the final performance to scale the assets properly.

That was the point of this question more than looking for advice on how to optimize my particle system.

What numbers have you seen an indie or triple A game set for particles in a scene?

I appreciate the numbers given by starstutter and TheNut. However, if others could provide similar feedback on their engines and past projects, it would give a more appropriate sample size for determining how adequate or inadequate the performance of my own (and others') particle systems is.

#11 imerso

    Senior Member

  • Members
  • PipPipPipPip
  • 426 posts
  • LocationBrasil

Posted 23 November 2009 - 12:32 AM

It clearly depends on the graphics card, but with my "old", middle range HD2600XT I can render like 5000+ particles with no degradation at all.

#12 MortenB

    New Member

  • Members
  • Pip
  • 1 posts

Posted 23 November 2009 - 01:41 AM

The most important aspect of getting the most out of your optimisations is knowing what to optimise. Get a good profiler and learn how it works. Both Intel and AMD have good profilers. For GPU-side optimisations, you can get good ones from the graphics card manufacturers.

#13 onyxthedog

    Senior Member

  • Members
  • PipPipPipPip
  • 467 posts

Posted 23 November 2009 - 05:08 AM

Just from what I've seen, as I haven't implemented one, 10,000 seems to be a decent number.

#14 starstutter

    Senior Member

  • Members
  • PipPipPipPip
  • 1039 posts

Posted 23 November 2009 - 06:26 PM

We should give some context to these stats though. Some people may be talking about an isolated particle system (10,000+ seems rather high for a true gameplay number). I was personally talking about one with all other effects and processing going on.

#15 Reedbeta

    DevMaster Staff

  • Administrators
  • 4782 posts
  • LocationBellevue, WA

Posted 23 November 2009 - 08:24 PM

Just to give one other data point: I loaded up Infamous and turned on our particle count diagnostic. After playing through a couple battles with heavy use of particles, I saw a high-water mark of around 60,000 - 70,000 live particles at a time. This is an exceptional case, though, about the highest count of any point in the game (and I can't be sure we maintained 30Hz through there either).
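A live-count diagnostic with a high-water mark can be as simple as the sketch below. This is a hypothetical illustration of the idea, not the diagnostic used in the game mentioned above; `ParticleCounter` and its methods are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Tracks the current number of live particles and the highest value seen.
class ParticleCounter {
public:
    void onSpawn(std::size_t n = 1)
    {
        m_live += n;
        m_peak = std::max(m_peak, m_live);
    }

    void onDeath(std::size_t n = 1)
    {
        assert(n <= m_live);
        m_live -= n;
    }

    std::size_t live() const { return m_live; }
    std::size_t peak() const { return m_peak; }

private:
    std::size_t m_live = 0;
    std::size_t m_peak = 0;
};
```

Calling `onSpawn` and `onDeath` from the emitter code and printing `peak()` after a play session gives a comparable figure for your own scenes.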
reedbeta.com - developer blog, OpenGL demos, and other projects

#16 onyxthedog

    Senior Member

  • Members
  • PipPipPipPip
  • 467 posts

Posted 24 November 2009 - 02:15 AM

I got my number from Linux Game Programming, a book I found on the 'net, and from the fact that the Elysian Shadows team mentions it in one of their videos. (One of the earlier ones, I think, but I could be wrong about when.) But like I said, that is just from hearing/reading.

#17 JarkkoL

    Senior Member

  • Members
  • PipPipPipPip
  • 458 posts

Posted 24 November 2009 - 05:04 AM

With particles it's not the count that matters but the size of the particles (and thus fillrate)
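A back-of-the-envelope way to compare particle sets by fill cost rather than by count is sketched below. It assumes axis-aligned screen-space quads and ignores clipping and off-screen particles; `ScreenParticle` and `overdrawFactor` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// On-screen size of one particle quad in pixels (hypothetical layout).
struct ScreenParticle { float widthPx, heightPx; };

// Average number of times each screen pixel is shaded by particles:
// total particle pixel area divided by screen pixel area.
double overdrawFactor(const std::vector<ScreenParticle>& particles,
                      int screenW, int screenH)
{
    assert(screenW > 0 && screenH > 0);
    double covered = 0.0;
    for (const ScreenParticle& p : particles) {
        covered += static_cast<double>(p.widthPx) * static_cast<double>(p.heightPx);
    }
    return covered / (static_cast<double>(screenW) * static_cast<double>(screenH));
}
```

By this measure, 6000 particles of 16x16 pixels at 640x480 shade as many pixels as 5 full-screen quads, and so do just 375 particles of 64x64 pixels.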

#18 .oisyn

    DevMaster Staff

  • Moderators
  • 1810 posts

Posted 24 November 2009 - 09:50 AM

JarkkoL said:

With particles it's not the count that matters but the size of the particles (and thus fillrate)
Well obviously count matters a lot when you want them to collide with the geometry :(
C++ addict
-
Currently working on: the 3D engine for Tomb Raider.

#19 JarkkoL

    Senior Member

  • Members
  • PipPipPipPip
  • 458 posts

Posted 24 November 2009 - 09:51 AM

... which you rarely want ;)

#20 poita

    Senior Member

  • Members
  • PipPipPipPip
  • 322 posts

Posted 24 November 2009 - 10:02 AM

JarkkoL said:

... which you rarely want ;)

Depends on the game.




