I'm looking for a fast way to remove duplicate vertices, based on position, from a list?
I've heard a hash table is a good way to do it, but I never used it before.
Looking for suggestions on how to do it.
Removal of duplicate vertices?
Started by Lost, Jan 01 2007 05:15 AM
7 replies to this topic
#1
Posted 01 January 2007 - 05:15 AM
#2
Posted 01 January 2007 - 02:21 PM
Hash tables are indeed the best datastructure for that. How to use them? Take a search on wikipedia. The article on hash-tables is a good one.
You will need to pick a hash function for your hashtable. Here is something that I've used with good results:
The idea here is to take the bitpatterns of the 3 floats, multiply each one with a prime (pick any primes you like..) and add them together. This gives good distributed hash-values even for problematic meshes. I've posted it because getting a good working hash function is the key to a good working hash-table, and it's not that easy to find something that works well for vectors. Of cause you can roll your own if you like.
Afterwards all you need to do is to modulo the result with your hash-table size and you'll get the index into the table.
You will need to pick a hash function for your hashtable. Here is something that I've used with good results:
typedef union
{
float f;
unsigned int i;
} intfloat;
unsigned int vectorhash (float x, float y, float z)
{
intfloat ix, iy, iz;
ix.f = x;
iy.f = y;
iz.f = z;
return ix.i*PRIME1 + iy.i*PRIME2 + iz.i*PRIME3;
}
The idea here is to take the bitpatterns of the 3 floats, multiply each one with a prime (pick any primes you like..) and add them together. This gives good distributed hash-values even for problematic meshes. I've posted it because getting a good working hash function is the key to a good working hash-table, and it's not that easy to find something that works well for vectors. Of cause you can roll your own if you like.
Afterwards all you need to do is to modulo the result with your hash-table size and you'll get the index into the table.
#3
Posted 01 January 2007 - 05:40 PM
Thanks, I just found some code here that seems to use that method:
http://codesupposito...04/welding.html
It looks like to me they use (1,11,-17) as the 'magic' numbers, but
1 and -17 aren't prime? What do you think they're doing here?
Other things I don't understand are &0x7fffffff and (f>>22)^(f>>12)^(f).
edit: ok, I see the MSB for floating point is the sign bit, and &0x7fffffff
clears it out, so it is always positive. Otherwise, you can have a different float-point
bit pattern for negative zero and postive zero, even though the value is the same.
Other stuff still eludes me. Must be hash-related.
http://codesupposito...04/welding.html
const unsigned int * h = (const unsigned int *)(&v); unsigned int f = (h[0] + h[1]*11 - (h[2]*17))&0x7fffffff; // avoid problems with +-0 return (f>>22)^(f>>12)^(f);
It looks like to me they use (1,11,-17) as the 'magic' numbers, but
1 and -17 aren't prime? What do you think they're doing here?
Other things I don't understand are &0x7fffffff and (f>>22)^(f>>12)^(f).
edit: ok, I see the MSB for floating point is the sign bit, and &0x7fffffff
clears it out, so it is always positive. Otherwise, you can have a different float-point
bit pattern for negative zero and postive zero, even though the value is the same.
Other stuff still eludes me. Must be hash-related.
#4
Posted 01 January 2007 - 07:58 PM
Another note: to save you the trouble of writing your own hash table code, if your STL supports it you can use the std::hash_set template. That's an SGI extension to the STL, but supported by most STL implementations (both Microsoft and g++, for instance).
reedbeta.com - developer blog, OpenGL demos, and other projects
#5
Posted 02 January 2007 - 11:27 PM
Thanks, but I think I still have to come up with my own hash key function for vertices,
whether I use STL or not?
Anyone who understands hash functions care to comment about that bit of code I posted?
About how (h[0] + h[1]*11 - (h[2]*17)) and (f>>22)^(f>>12)^(f) came about?
I'm thinking maybe it has to do with making postive and negative floats, like -2 and 2, unique, but still make -0 and +0 the same??
whether I use STL or not?
Anyone who understands hash functions care to comment about that bit of code I posted?
About how (h[0] + h[1]*11 - (h[2]*17)) and (f>>22)^(f>>12)^(f) came about?
I'm thinking maybe it has to do with making postive and negative floats, like -2 and 2, unique, but still make -0 and +0 the same??
#6
Posted 03 January 2007 - 12:24 AM
the factors 1, 11 and 17 are primes and used to build the basic hash out of the 3 elements. The bitwise AND with the constant just removes the sign bit from the float, so you'll get the same hash for +zero and -zero.
The reason why he subtract the h[2]*17 might be that he wants to bias his temporary float around zero, this gives the most precision for floats. Not strictly required for a hash, but not a bad idea either. It just avoids overflows or exponential shifts of the floats.
The xor-shift thing just mangles the bits around. He did that to mix up bits from the exponent and mantissa parts. This improves the randomness of the hash a bit.
The reason why he subtract the h[2]*17 might be that he wants to bias his temporary float around zero, this gives the most precision for floats. Not strictly required for a hash, but not a bad idea either. It just avoids overflows or exponential shifts of the floats.
The xor-shift thing just mangles the bits around. He did that to mix up bits from the exponent and mantissa parts. This improves the randomness of the hash a bit.
#7
Posted 04 January 2007 - 04:03 AM
Thanks for the explanation!
BTW, I don't believe the number 1 is a prime as stated by Wiki:
" Any natural number greater than 1..."
http://en.wikipedia....ki/Prime_number
Which is why I don't understand why they would use it. Maybe it doesn't matter.
BTW, I don't believe the number 1 is a prime as stated by Wiki:
" Any natural number greater than 1..."
http://en.wikipedia....ki/Prime_number
Which is why I don't understand why they would use it. Maybe it doesn't matter.
#8
Posted 04 January 2007 - 05:15 AM
Technically, 1 isn't a prime, but I suspect it only matters that the numbers chosen have no divisors in common other than 1, i.e. they are coprimes. Of course, 1 is coprime to any number, and prime numbers are always coprime to each other.
reedbeta.com - developer blog, OpenGL demos, and other projects
1 user(s) are reading this topic
0 members, 1 guests, 0 anonymous users












