# Random probability

17 replies to this topic

### #1Alienizer

Member

• Members
• 435 posts

Posted 20 March 2012 - 02:22 PM

Here is my problem...

I have X number of triangles, each with a different area (different size). I would like to randomly pick one from the list (that part is easy), BUT, those with a larger area must have a greater chance of being picked, and those with mallest area, a lesser chance of getting hit.

Any idea on how to do this? I googled around by found nothing!
Thanks

### #2Nautilus

Senior Member

• Members
• 351 posts

Posted 20 March 2012 - 03:01 PM

Here's a 5 minutes cheap solution until you come up with a more elegant one by yourself:

1) Precompute the area of each triangle.

2) Estabilish some arbitrary chance-factor to multiply by the computed area of each triangle (ideally this must give you an integer number that increases as the area of the triangle increases). You can also just reuse the very Triangle's area, if that ends up working for you (below you'll see why).

3) Assign an arbitrary numeric integer Id to each triangle (no two triangles must have the same Id).

4) Now prepare an array of Id's. In it you write a copy of each triangle's Id, based on the magnitude of the chance factor for the triangle in exam. For example, a triangle with Area = 10 might have 10 instances of its Id in the array, whereas a triangle with Area = 3 might have only 3 instances of its Id.

5) Randomly pick any of the Ids inside the array.

Simplicity is bliss.
Bigger triangles will naturally have more Ids in the array, and thus more chances to be picked. Conversely, smaller triangles will have less Ids in the array, and so less chances to be picked. I ignore how many triangles we're talking about here (you've been vague). If your list is reasonably big, you ought to find a better procedure than this. You'll probably want to, anyway. This is only good for a draft - or for a scenario wherein performance is irrelevant..
-Nautilus

(readin' this? you ought to get out more)

### #3Alienizer

Member

• Members
• 435 posts

Posted 20 March 2012 - 03:17 PM

That make sense thank you. Now that I've read your post, something comes to mind. Can I use the "Russian Roulette" based on the area? Say, I run a loop and if frand(0..1) < (TriangleArea/MaxArea) then pick that triangle else itterate?

### #4Nautilus

Senior Member

• Members
• 351 posts

Posted 20 March 2012 - 03:41 PM

Alienizer, on 20 March 2012 - 03:17 PM, said:

if frand(0..1) < (TriangleArea/MaxArea) then pick that triangle else itterate?

By doing so you process one triangle at a time. In that way the triangles that are processed first, do acquire more chances to be picked over the triangles with the same area (and that should have the same chances as the one being processed) further down in the 'list'.
Instead you want to pick 'one' out of the 'many'. To preserve the correct chance distribution you have to treat the many triangles as a whole.
-Nautilus

(readin' this? you ought to get out more)

### #5Alienizer

Member

• Members
• 435 posts

Posted 20 March 2012 - 04:00 PM

So then I can randomly pick one triangle, and if it pass the "Russian Roulette" test, keep it, otherwise repeat the process?

### #6geon

Senior Member

• Members
• 939 posts

Posted 20 March 2012 - 04:23 PM

Nautilus, with your method you get a very limited probability "resolution". A simpler way exists;

1. Precompute all triangle areas.
2. Let X = sum of all triangle areas
2. Loop over the list of triangles
4. For each one, the chance of picking it is the area/X
5. Subtract the area from X.

As you see, the chance of picking any one triangle is always proportional to the area divided by the ones left in the list. At the last triangle the probability would be 1. (Beware of float error build-up.)

### #7Reedbeta

DevMaster Staff

• 5305 posts
• LocationBellevue, WA

Posted 20 March 2012 - 04:53 PM

Even better way:

1. Precompute all triangle areas and the sum of areas
2. Precompute a list containing, for each triangle, the sum of all areas up to this triangle, divided by the total area of all triangles. So entry i in the list contains the sum of all triangles from 0 to i (inclusive). The list is monotonically increasing and the last entry is 1.
3. To pick a triangle, get a random value in [0, 1) and do a binary search on the above array, as if you were going to insert the random value in sorted order. The index you get back is the triangle to pick.
reedbeta.com - developer blog, OpenGL demos, and other projects

### #8Alienizer

Member

• Members
• 435 posts

Posted 20 March 2012 - 08:16 PM

Reedbeta, on 20 March 2012 - 04:53 PM, said:

Even better way:

1. Precompute all triangle areas and the sum of areas
2. Precompute a list containing, for each triangle, the sum of all areas up to this triangle, divided by the total area of all triangles. So entry i in the list contains the sum of all triangles from 0 to i (inclusive). The list is monotonically increasing and the last entry is 1.
3. To pick a triangle, get a random value in [0, 1) and do a binary search on the above array, as if you were going to insert the random value in sorted order. The index you get back is the triangle to pick.

hmmm, this must be a very clever way Reedbeta, because I can't seem to understand the logic of it. Do you mind to elaborate on this please?

### #9Reedbeta

DevMaster Staff

• 5305 posts
• LocationBellevue, WA

Posted 20 March 2012 - 08:37 PM

It works by partitioning the [0, 1] interval into subintervals, each of which corresponds to a single triangle. The subintervals are sized proportional to the triangle's area. To pick a triangle, you pick a random number on [0, 1) and see in which interval it falls, hence you're picking a triangle with probability proportional to area.

To represent the intervals you only need a list of the points where you cross from one interval to the next, since they are a partition (they cover the whole unit interval). This is the array constructed in step 2. To see in which interval any random value X falls, you search for the smallest element in the array that is greater than X. This can be done with binary search since the array is sorted in ascending order. If you pretend you're going to insert X into the array, it's equivalent to finding the index at which you'd put it to maintain the sort.
reedbeta.com - developer blog, OpenGL demos, and other projects

### #10Alienizer

Member

• Members
• 435 posts

Posted 20 March 2012 - 09:11 PM

So the list is the same size as the number of triangles? I'm still lost as to how this works Reedbeta, I'm pulling my hair on this one. oh boy you got me thinking hard LOL

### #11Reedbeta

DevMaster Staff

• 5305 posts
• LocationBellevue, WA

Posted 20 March 2012 - 09:53 PM

Yeah, the list is the same size as the number of triangles.

If the triangles' areas are 1, 4, 2, and 3, then the list would be 1, 5, 7, 10 - each entry is the sum of all the preceding triangles' areas. Then divide by the total area (10) so it would be 0.1, 0.5, 0.7, 1.0.

Then when you pick a random value between 0 and 1, 10% of the time it will be < 0.1, 40% of the time it will be between 0.1 and 0.5, 20% of the time between 0.5 and 0.7, and 30% of the time between 0.7 and 1.0. The probabilities are proportional to the areas.

For example, if the random value you pick is 0.567, that falls in the third interval (between 0.5 and 0.7), so you'd pick the third triangle. To find that, you'd do a binary search for 0.567 in the list 0.1, 0.5, 0.7, 1.0, and it should return index 2, which is where you'd insert if you were going to insert 0.567 into the list.
reedbeta.com - developer blog, OpenGL demos, and other projects

### #12Alienizer

Member

• Members
• 435 posts

Posted 20 March 2012 - 10:21 PM

oh I get it now! But should I sort the list to 1,2,3,4 first as oppose to 1,4,2,3?

### #13Reedbeta

DevMaster Staff

• 5305 posts
• LocationBellevue, WA

Posted 20 March 2012 - 10:35 PM

No, the triangles don't need to be sorted by area to begin with. The list you build in step 2 will always come out sorted, no matter what, because you're adding items together to build it. Sorting the triangles from smaller to larger area neither hurts nor helps.
reedbeta.com - developer blog, OpenGL demos, and other projects

### #14Alienizer

Member

• Members
• 435 posts

Posted 21 March 2012 - 12:03 AM

I got it, make sense. Thanks again Reedbeta, I really appreciate your help and time. Thanks to the others who helped as well.

### #15Reedbeta

DevMaster Staff

• 5305 posts
• LocationBellevue, WA

Posted 04 April 2012 - 06:54 PM

I just saw this article on AltDevBlogADay that describes a potentially even faster method for doing this sort of thing, although it requires a more complicated preprocess and more storage.
reedbeta.com - developer blog, OpenGL demos, and other projects

### #16Alienizer

Member

• Members
• 435 posts

Posted 05 April 2012 - 02:39 AM

Interesting! it's kind of neet, and almost like what Nautilus suggested.

### #17roel

Senior Member

• Members
• 698 posts

Posted 05 April 2012 - 08:34 AM

Reedbeta, on 04 April 2012 - 06:54 PM, said:

I just saw this article on AltDevBlogADay that describes a potentially even faster method for doing this sort of thing, although it requires a more complicated preprocess and more storage.
I visited this topic for exactly the same reason

### #18gooncorp

Member

• Members
• 73 posts
• LocationBay Area, California

Posted 13 April 2012 - 04:55 PM

stochastic expressions are a big part of my technique
Less talk more chalk.