SwiftShader & Performance Counter

polyvertex 101 Apr 09, 2008 at 10:11

Hi Nick,

I’ve just downloaded the old open source code from the link provided by “juhana” in this thread to take a quick look at it.

I’m a bit surprised by the implementation of your Timer class…

I remember that when I created my own Timer class for my 3D engine, I ended up with approximately the same straightforward implementation :)
But I quickly got some weird timing effects (3D animations went fast, then slow, then fast, etc…). At first I thought it was because of my crappy animation system, but then I found that tons of developers had the same problem, and it seems to happen on multi-core platforms only (e.g. this GD thread).

I know this is old code, so maybe the Timer class in today’s SwiftShader is no longer the same, but I just had to ask: didn’t you have any side effects like mine on multi-core platforms with this implementation?

Maybe it was so long ago that you could not try it on a multi-core platform, in which case my question is totally irrelevant… :)

11 Replies

Nick 102 Apr 10, 2008 at 17:31

That class is no longer used (there’s nothing time dependent). But I’d love to know what you discovered exactly and whether you found better alternatives!

polyvertex 101 Apr 10, 2008 at 22:32

Errr, OK :)

Well, I did not “discover” that much…
And I think I’d better not feed the troll again about performance timing and profiling under Windows… :)

AFAIK, performance counters under Windows have two well-known problems:
* Unexpectedly large leaps between two calls to QueryPerformanceCounter() (this seems to happen on every platform, example here). Here is the MSDN KB article about it.
* The side effect I described, caused by each core having its own clock (so the problem occurs only on multi-core platforms, example here).

For clarity, here is “your” implementation (basically the same as my old one):

double Timer::seconds()
{
    __int64 currentTime;
    __int64 frequency;

    QueryPerformanceFrequency((LARGE_INTEGER*)&frequency);
    QueryPerformanceCounter((LARGE_INTEGER*)&currentTime);

    return (double)currentTime / (double)frequency;
}

Now, the most common and recommended trick:
There is an example implementation in OGRE 3D (Timer::getMilliseconds method at line 119).

As you can see, this is awful, but during my googling trip about this problem, I found nothing better for performance timing.

Maybe someone here uses another trick?

Kenneth_Gorking 101 Apr 10, 2008 at 23:37

What if the thread the timer is running on runs at half the speed of the other threads?

Reedbeta 167 Apr 11, 2008 at 04:24

Kenneth: that should be accounted for in the value returned from QueryPerformanceFrequency, no?

Kenneth_Gorking 101 Apr 11, 2008 at 05:15

True, but I was thinking about synchronicity. If one subsystem is running twice as fast/slow as the others, couldn’t timing anomalies occur?

Consider this:

Thread 1 (1.5 GHz): Rendering, AI
Thread 2 (3.0 GHz): Physics, Audio

Reedbeta 167 Apr 11, 2008 at 05:45

Oh, definitely, if they’re running on different cores (with physically separate clocks) then even if the cores are at the same clock speed, timing anomalies WILL occur. That’s what the code polyvertex linked to is trying to account for, by setting the thread affinity mask so that it always calls QPC on the same core.

I don’t think you can have two threads on the same core running at different clock speeds…that wouldn’t make much sense to me.

polyvertex 101 Apr 11, 2008 at 06:23

Reedbeta is right about thread affinity and clock speed.

You could simply call SetThreadAffinityMask(threadHandle, 1) to always stick to the first core but, as you can see, the OGRE developers go further by calling GetProcessAffinityMask() (see the Timer::reset() method, line 84) so that the mask respects the current OS and hardware settings, which seems to be a good thing…
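To make the affinity idea concrete, here is a minimal sketch of my own (it is not OGRE's code). The name `pinnedTicks` is made up for illustration, and the non-Windows branch is only an assumption added so the snippet compiles elsewhere. On Windows it picks the lowest core allowed by the process mask, pins the thread to it while reading the counter, then restores the previous mask:

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#else
#include <chrono>
#endif

// Hypothetical helper: returns the raw counter value and writes the
// counter frequency (ticks per second) to *frequency.
std::int64_t pinnedTicks(std::int64_t* frequency)
{
#ifdef _WIN32
    HANDLE thread = GetCurrentThread();

    // Ask which cores this process is allowed to run on.
    DWORD_PTR processMask = 0, systemMask = 0;
    GetProcessAffinityMask(GetCurrentProcess(), &processMask, &systemMask);

    // Keep only the lowest set bit, i.e. the first allowed core.
    DWORD_PTR timerMask = processMask & (~processMask + 1);
    if (timerMask == 0)
        timerMask = 1; // fall back to the first core

    // Pin the thread so every counter read happens on the same core.
    DWORD_PTR oldMask = SetThreadAffinityMask(thread, timerMask);

    LARGE_INTEGER freq, now;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&now);

    // Restore the previous mask (0 means the earlier call failed).
    if (oldMask != 0)
        SetThreadAffinityMask(thread, oldMask);

    *frequency = freq.QuadPart;
    return now.QuadPart;
#else
    // Assumption for non-Windows builds: a monotonic clock in nanoseconds.
    using namespace std::chrono;
    *frequency = 1000000000LL;
    return duration_cast<nanoseconds>(
        steady_clock::now().time_since_epoch()).count();
#endif
}
```

Switching the mask on every call has a cost, so a real timer would probably compute the mask once at reset time and reuse it.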

I have to say that since I changed my Windows Timer class to use those tricks (constant thread affinity masking and compensating for potential leaps), I haven’t had any timing problems…
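For the leap compensation part, the general idea is to cross-check the high-resolution counter against a coarse but reliable millisecond clock and shift the start time when they disagree too much. Below is a rough sketch under my own assumptions: the `LeapCompensator` type, its member names and the 100 ms threshold are illustrative choices rather than a copy of any library, and the clock values are passed in as plain integers so the logic stays independent of the OS calls.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: turns raw counter readings into elapsed
// milliseconds, correcting the counter when it leaps away from a
// coarse reference clock.
struct LeapCompensator
{
    std::int64_t frequency;   // counter ticks per second
    std::int64_t startTicks;  // counter value at reset
    std::int64_t startMs;     // coarse clock value (ms) at reset
    std::int64_t lastElapsed; // elapsed ticks reported by the previous call

    LeapCompensator(std::int64_t freq, std::int64_t ticks, std::int64_t ms)
        : frequency(freq), startTicks(ticks), startMs(ms), lastElapsed(0) {}

    // nowTicks: current counter value; nowMs: current coarse clock in ms.
    std::int64_t elapsedMs(std::int64_t nowTicks, std::int64_t nowMs)
    {
        std::int64_t elapsed = nowTicks - startTicks;
        std::int64_t fineMs = elapsed * 1000 / frequency;
        std::int64_t offMs = fineMs - (nowMs - startMs);

        // Assumed threshold: treat more than 100 ms of disagreement as a leap.
        if (offMs < -100 || offMs > 100)
        {
            // Remove the leap, but never step back past the previous
            // result, so reported time cannot go backwards.
            std::int64_t adjust = std::min(offMs * frequency / 1000,
                                           elapsed - lastElapsed);
            startTicks += adjust;
            elapsed -= adjust;
            fineMs = elapsed * 1000 / frequency;
        }

        lastElapsed = elapsed;
        return fineMs;
    }
};
```

On Windows the two inputs would typically come from QueryPerformanceCounter() and a tick-count style clock, read on the same pinned core.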

Kenneth_Gorking 101 Apr 11, 2008 at 07:16

@Reedbeta

I don’t think you can have two threads on the same core running at different clock speeds…that wouldn’t make much sense to me.

It was supposed to say core 1/2, my bad :)

polyvertex 101 Apr 11, 2008 at 09:01

Wow, I just found this.
Does anybody know what this “/invisible/src/” URL thingy is? :)

Reedbeta, why did you suggest to the OP to convert LARGE_INTEGER to floating point here?

Sol_HSA 119 Apr 11, 2008 at 10:57

‘invisible’ seems to be a Microsoft Research project, last updated in 2004.

Reedbeta 167 Apr 11, 2008 at 20:26

@polyvertex

Reedbeta, why did you suggest to the OP to convert LARGE_INTEGER to floating point here?

I was just thinking that it would be simpler and more convenient to work with times in floating point format rather than 64-bit integers. However, Nils said later in that thread that he thought this was a bad idea because the roundoff errors could become significant. So, the timer should internally keep the time as a LARGE_INTEGER but then convert it to float when calculating the timestep for each frame (assuming your physics and logic routines use floating-point time).
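As a small sketch of what that looks like in practice (the function names are hypothetical, and plain 64-bit integers stand in for LARGE_INTEGER::QuadPart so the snippet is portable), compare subtracting first and converting the small difference with converting each absolute timestamp first:

```cpp
#include <cstdint>

// Subtract in 64-bit integers, then convert only the small difference.
double frameDeltaSeconds(std::int64_t prevTicks, std::int64_t nowTicks,
                         std::int64_t frequency)
{
    return static_cast<double>(nowTicks - prevTicks) /
           static_cast<double>(frequency);
}

// For contrast: convert each absolute timestamp to float, then subtract.
// The large absolute values eat the mantissa bits the delta needs.
float naiveDeltaSeconds(std::int64_t prevTicks, std::int64_t nowTicks,
                        std::int64_t frequency)
{
    float prevSeconds = static_cast<float>(prevTicks) /
                        static_cast<float>(frequency);
    float nowSeconds = static_cast<float>(nowTicks) /
                       static_cast<float>(frequency);
    return nowSeconds - prevSeconds;
}
```

A single-precision float has a 24-bit mantissa, so once the absolute time reaches about ten days (864000 s) its resolution is 0.0625 s, which is too coarse to represent a 1/60 s frame step. The integer subtraction is exact regardless of uptime.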