Hi All,
I woke up this morning, had a moment of inspiration, and figured out a cool way to partially parallelize force based physics simulations. I've been getting into writing articles/tutorials, so here you go:
http://tobeythorn.co...-Objects-R2.pdf
I believe that this method is a bit different than the one that appears in GPU Gems 3.
Peace,
Tobey
PS: My computer is old and not parallel and thus the absence of a working code sample.
Update: fixed a few minor mistakes in the article (and reduced the number of interactions that need to be calculated even further).
Parallelizing N-body Force-based Physics Simulations for Point-like Objects
Started by tobeythorn, Mar 03 2011 08:26 PM
5 replies to this topic
#2
Posted 06 March 2011 - 03:06 PM
Made a significant update to my n-body paper. It now includes better/bigger/more diagrams and I hope, a clearer explanation of the technique.
http://tobeythorn.co...-Objects-R2.pdf
Edit: To make a correction pointed out by EricTheRed, it does not run in O(N) time (for asymptotically big N), but rather could theoretically run in N time so long as there are at least N/2 cores.
http://tobeythorn.co...-Objects-R2.pdf
Edit: To make a correction pointed out by EricTheRed, it does not run in O(N) time (for asymptotically big N), but rather could theoretically run in N time so long as there are at least N/2 cores.
#3
Posted 07 March 2011 - 03:42 AM
tobeythorn said:
... it theoretically runs in O(N) time on highly parallel hardware.
The number of cores does not affect the complexity so does it also run in O(n) time in a single thread?
It's not what you know, it's what other people think you know.
Just hope you don't get quizzed on it.
Game engine design tutorials
Just hope you don't get quizzed on it.
Game engine design tutorials
#4
Posted 07 March 2011 - 02:42 PM
Hi Eric,
Maybe I do not correctly understand the exact meaning of complexity denoted by O-notation. What I've done is taken the N-Body force-interaction problem, which requires at least N(N-1)/2 forces to be calculated (typically in serial), and rearranged the them into groups such that it should be possible to calculate the forces N/2 at a time without any fighting over shared memory (supposing you have N/2 cores). Ideally, the time for computations should therefor be linear in N.
Yesterday I spent a bit of time reading about GPGPU and was disappointed to find that apparently GPUs are quite limited in the kinds of problems they are suited for (eg, one must avoid shared global memory as much as possible and one must break work into certain sized chunks depending on hardware, and transferring data to and from GPU is slow as well). Unfortunately, it seems that the parallelization trick I came up with may only be practical on systems with many CPUs (ie, supercomputer) or on specialized FPGA's.
Maybe I do not correctly understand the exact meaning of complexity denoted by O-notation. What I've done is taken the N-Body force-interaction problem, which requires at least N(N-1)/2 forces to be calculated (typically in serial), and rearranged the them into groups such that it should be possible to calculate the forces N/2 at a time without any fighting over shared memory (supposing you have N/2 cores). Ideally, the time for computations should therefor be linear in N.
Yesterday I spent a bit of time reading about GPGPU and was disappointed to find that apparently GPUs are quite limited in the kinds of problems they are suited for (eg, one must avoid shared global memory as much as possible and one must break work into certain sized chunks depending on hardware, and transferring data to and from GPU is slow as well). Unfortunately, it seems that the parallelization trick I came up with may only be practical on systems with many CPUs (ie, supercomputer) or on specialized FPGA's.
#5
Posted 07 March 2011 - 04:52 PM
Big-O notation is used to describe how the complexity increases asymptotically (i.e. as n gets very big). For your algorithm, it may run in O(n) if the number of cores is greater than n/2, but that is not the asymptotic behavior. That is one of the shortcomings of using big-O. Your algorithm is still O(n^2), but the constant factor is divided by the number of cores. This constant factor is an important part of the practical performance, but unfortunately cannot be expressed in big-O. Saying that the algorithm is highly parallelizable is accurate and will get your point across.
It's not what you know, it's what other people think you know.
Just hope you don't get quizzed on it.
Game engine design tutorials
Just hope you don't get quizzed on it.
Game engine design tutorials
#6
Posted 07 March 2011 - 05:04 PM
Eric,thanks for the explanation and correction. I understand the difference now and will edit my claims accordingly.
1 user(s) are reading this topic
0 members, 1 guests, 0 anonymous users












