Faster IsIdentity()?


5 replies to this topic

#1 Nautilus

    Senior Member

  • Members
  • 326 posts

Posted 09 November 2007 - 09:39 PM

Hello,
to check if a Matrix is an Identity I use this function:

bool IsIdentity (const Mat44& m)
{
    // Treat the 16 floats as raw 32-bit patterns.
    const DWORD* pFloat = (const DWORD*) &m.m11;

    // Every off-diagonal element must be all-zero bits (+0.0f).
    DWORD Sum = pFloat [1];
    Sum |= pFloat [2];
    Sum |= pFloat [3];
    Sum |= pFloat [4];
    Sum |= pFloat [6];
    Sum |= pFloat [7];
    Sum |= pFloat [8];
    Sum |= pFloat [9];
    Sum |= pFloat [11];
    Sum |= pFloat [12];
    Sum |= pFloat [13];
    Sum |= pFloat [14];

    if (Sum) return false;

    // Add up the bit patterns of the diagonal.
    Sum  = pFloat [0];
    Sum += pFloat [5];
    Sum += pFloat [10];
    Sum += pFloat [15];

    // 0xFE000000 = 0x3F800000 * 4
    // 0x3F800000 = bitmask of 1.0f
    return (Sum == 0xFE000000);
}
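
For anyone who wants to double-check the magic numbers, here is a small sketch. The helper names (FloatBits, DiagonalSumLooksLikeOnes, DiagonalIsAllOnes) are made up for illustration and are not part of the function above. It confirms the bit pattern of 1.0f, and it also shows that adding the diagonal bit patterns can accept a diagonal that is not all ones, whereas XOR-ing each element against 0x3F800000 and OR-ing the results does not have that problem.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw IEEE-754 bits of a float.
static std::uint32_t FloatBits(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Same idea as the diagonal test above: add the four bit patterns
// and compare the sum with 4 * bits(1.0f).
static bool DiagonalSumLooksLikeOnes(const std::uint32_t d[4])
{
    return d[0] + d[1] + d[2] + d[3] == 0xFE000000u;
}

// Stricter variant: XOR each element against bits(1.0f) and OR the
// differences, so any single wrong bit makes the result non-zero.
static bool DiagonalIsAllOnes(const std::uint32_t d[4])
{
    const std::uint32_t one = 0x3F800000u;
    return ((d[0] ^ one) | (d[1] ^ one) | (d[2] ^ one) | (d[3] ^ one)) == 0;
}
```

A diagonal with the bit patterns 0x3F800001, 0x3F7FFFFF, 0x3F800000, 0x3F800000 passes the sum test but fails the stricter one. Such matrices are probably rare in practice, so whether it matters depends on where your matrices come from.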

I'm betting that the compiler will optimize it for me, because my asm is lame.

Up until now I've been happy with it, but lately I've been making heavy use of Matrix multiplications (due to my scene graph).
So in order to avoid useless mul's I must detect & ignore Identity matrices.

Now it's not that I have all that many Identity Matrices around... but I'm happier if I can speed up the above check.

Any tips?

Thanks In Advance,
ciao ciao : )
-Nautilus

(readin' this? perhaps you should get out more -- give it a thought)


#2 Reedbeta

    DevMaster Staff

  • Administrators
  • 4979 posts
  • Location: Bellevue, WA

Posted 10 November 2007 - 01:15 AM

If you don't care about detecting matrices that are *nearly* the identity (from the above code, I'm guessing not :)), it might be faster simply to keep a spare identity matrix lying around in memory and just do memcmp() with that.

Another possibility is to do an equality check on each element of the array:
const DWORD one = 0x3f800000;
return (pFloat[0] == one) &&
       (pFloat[1] == 0) &&
       ...

I couldn't tell you off the top of my head whether either of these will actually be faster or not (it's surely architecture dependent too). You'll have to profile to see.
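
A minimal sketch of the memcmp() idea, assuming a Mat44 that is nothing but 16 contiguous floats. The struct layout, kIdentity and IsIdentityMemcmp are stand-ins for illustration, not the poster's real types.

```cpp
#include <cstring>

// Hypothetical stand-in for the thread's Mat44: 16 contiguous floats,
// row by row, with no other members (so there is no padding to compare).
struct Mat44
{
    float m11, m12, m13, m14;
    float m21, m22, m23, m24;
    float m31, m32, m33, m34;
    float m41, m42, m43, m44;
};

// The spare identity matrix kept around in memory.
static const Mat44 kIdentity =
{
    1.0f, 0.0f, 0.0f, 0.0f,
    0.0f, 1.0f, 0.0f, 0.0f,
    0.0f, 0.0f, 1.0f, 0.0f,
    0.0f, 0.0f, 0.0f, 1.0f
};

// Bitwise comparison: true only if all 64 bytes match exactly.
bool IsIdentityMemcmp(const Mat44& m)
{
    return std::memcmp(&m, &kIdentity, sizeof(Mat44)) == 0;
}
```

Like the integer tricks above, this is a bit-exact test, so a matrix holding -0.0f in an off-diagonal slot is reported as not identity.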
reedbeta.com - developer blog, OpenGL demos, and other projects

#3 J22

    Member

  • Members
  • 92 posts

Posted 10 November 2007 - 08:26 AM

Just hold a flag that tells if the matrix is identity, and update the flag only when the matrix changes.

#4 Nils Pipenbrinck

    Senior Member

  • Members
  • 597 posts

Posted 10 November 2007 - 11:42 AM

The flag is by far the best solution. I use flags for matrices myself.

You can as well use SSE2 as shown here (loop unrolling highly recommended) http://groups.google...66c3a4d4e22eec8
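
This is not the code from that link, just a rough sketch of what an SSE2 check could look like with compiler intrinsics instead of asm. It assumes the matrix is 16 contiguous floats; kIdentityBits, IsIdentitySSE2 and the ISIDENTITY_HAVE_SSE2 macro are made-up names, and the non-SSE2 branch simply falls back to memcmp().

```cpp
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define ISIDENTITY_HAVE_SSE2 1   // hypothetical macro, local to this sketch
#endif

// Bit patterns of the identity matrix, row by row (0x3F800000 is 1.0f).
static const std::uint32_t kIdentityBits[16] =
{
    0x3F800000u, 0, 0, 0,
    0, 0x3F800000u, 0, 0,
    0, 0, 0x3F800000u, 0,
    0, 0, 0, 0x3F800000u
};

// p points at 16 contiguous floats (e.g. &m.m11 in the thread's Mat44).
bool IsIdentitySSE2(const float* p)
{
#ifdef ISIDENTITY_HAVE_SSE2
    const __m128i* src = reinterpret_cast<const __m128i*>(p);
    const __m128i* ref = reinterpret_cast<const __m128i*>(kIdentityBits);

    // Compare each row with the matching identity row, 4 x 32 bits at a time.
    // Unaligned loads are used because nothing here guarantees alignment.
    const __m128i eq0 = _mm_cmpeq_epi32(_mm_loadu_si128(src + 0), _mm_loadu_si128(ref + 0));
    const __m128i eq1 = _mm_cmpeq_epi32(_mm_loadu_si128(src + 1), _mm_loadu_si128(ref + 1));
    const __m128i eq2 = _mm_cmpeq_epi32(_mm_loadu_si128(src + 2), _mm_loadu_si128(ref + 2));
    const __m128i eq3 = _mm_cmpeq_epi32(_mm_loadu_si128(src + 3), _mm_loadu_si128(ref + 3));

    // Two independent AND chains merged at the end; a lane stays all-ones
    // only if it matched in every row.
    const __m128i all = _mm_and_si128(_mm_and_si128(eq0, eq1), _mm_and_si128(eq2, eq3));
    return _mm_movemask_epi8(all) == 0xFFFF;
#else
    // Portable fallback when SSE2 is not available.
    return std::memcmp(p, kIdentityBits, sizeof kIdentityBits) == 0;
#endif
}
```

As with everything else in this thread, profile before keeping it; for a single 64-byte matrix the gain may well be lost in the noise.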

If you want to stay away from asm you can still optimize your code a bit with some "useless in the real world micro-optimization":

Make sure you access your array elements sequentially, even if that costs you the early-out test. This improves memory access time a bit: the CPU likes to read sequentially and will prefetch some data for you. The compiler is free to rearrange the memory accesses, which is not always what you want. Declaring the pointer as volatile will help here, but it could have a negative effect as well (check your code - it might run faster without the volatile; that's compiler dependent).

Next, break your long dependency chain into several smaller ones and merge the results at the end. The compiler could do this for you, but I would hint it into doing this (just in case). Breaking long dependency chains is a nice thing to do for all superscalar and out-of-order CPU architectures, and it won't do any harm on dumb CPUs that still execute everything sequentially.


This is my version. Could be like 5% faster or so..



bool IsIdentity (const Mat44& m)
{
    const volatile DWORD* pFloat = (DWORD*) &m.m11;

    DWORD Trc  = pFloat [0];
    DWORD Sum0 = pFloat [1];
    DWORD Sum1 = pFloat [2];
    DWORD Sum2 = pFloat [3];

    Sum0 |= pFloat [4];
    Trc  += pFloat [5];
    Sum1 |= pFloat [6];
    Sum2 |= pFloat [7];

    Sum0 |= pFloat [8];
    Sum1 |= pFloat [9];
    Trc  += pFloat [10];
    Sum2 |= pFloat [11];

    Sum0 |= pFloat [12];
    Sum1 |= pFloat [13];
    Sum2 |= pFloat [14];
    Trc  += pFloat [15];

    if (Trc != 0xFE000000) return false;
    if (Sum0 | Sum1 | Sum2) return false;
    return true;
}


My music: http://myspace.com/planetarchh <-- my music

My stuff: torus.untergrund.net <-- some diy electronic stuff and more.

#5 Nautilus

    Senior Member

  • Members
  • 326 posts

Posted 10 November 2007 - 10:36 PM

Nils Pipenbrinck said:

The flag is by far the best solution. I use flags for matrices myself.

Where do you store the flag exactly: in an extra member of the Matrix, or do you borrow one bit from a specific member (say, m44)?

In my case inserting an extra member would break a lot of existing code.
So I'm almost thinking of borrowing the bit from m44, and setting up the m44 member of my Mat44 as a float union, with methods and operators to preserve compatibility with the expected float type.

Thanks for the help, everyone.
Ciao ciao : )
-Nautilus



#6 J22

    Member

  • Members
  • 92 posts

Posted 11 November 2007 - 07:51 AM

I wouldn't store it in the matrix, but as an extra member in your scene graph node. It would be tricky to keep the flag up to date if it were a member of the matrix, for example if you have matrix operations that return references to matrix elements (e.g. operator[]). If it's part of the scene graph node, you just need to make sure that matrix setting is properly encapsulated, i.e. don't return matrix references or such.
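
A rough sketch of that layout, under assumptions: Mat44 here is a bare 16-float struct, and MakeIdentity, Multiply, SceneNode and its members are hypothetical names invented for the example. The node owns the matrix, only hands out read-only access, and updates the flag in the two places where the matrix can change.

```cpp
// Hypothetical 4x4 matrix: 16 floats, row-major.
struct Mat44 { float m[16]; };

Mat44 MakeIdentity()
{
    Mat44 r = {};
    r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
    return r;
}

// Plain row-major product a * b.
Mat44 Multiply(const Mat44& a, const Mat44& b)
{
    Mat44 r = {};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i * 4 + j] += a.m[i * 4 + k] * b.m[k * 4 + j];
    return r;
}

class SceneNode
{
public:
    SceneNode() : local_(MakeIdentity()), localIsIdentity_(true) {}

    // The only ways to change the matrix, so the flag cannot go stale.
    // SetLocal conservatively clears the flag: an identity passed in here
    // is still multiplied, which is correct, just not skipped.
    void SetLocal(const Mat44& m) { local_ = m; localIsIdentity_ = false; }
    void ResetLocal() { local_ = MakeIdentity(); localIsIdentity_ = true; }

    // Read-only access; no mutable reference ever leaves the node.
    const Mat44& Local() const { return local_; }
    bool LocalIsIdentity() const { return localIsIdentity_; }

    // Skips the multiplication when the local transform is the identity.
    Mat44 WorldFrom(const Mat44& parentWorld) const
    {
        return localIsIdentity_ ? parentWorld : Multiply(parentWorld, local_);
    }

private:
    Mat44 local_;
    bool  localIsIdentity_;
};
```

If SetLocal is often called with matrices that happen to be the identity, it could run one of the bitwise checks from earlier in the thread once at set time instead of always clearing the flag.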




