I need to optimize some code.
Speed is critical, and I'm out of ideas.
If you are interested, please read.
Bottleneck - Part #1:
Given a 32 bits BitMask, I first need to test if certain bits are ON.
Bits of interest are [0-26].
At least 1 of them is always ON.
Any number of bits can be simultaneously ON.
Any combination using bits [0-26] is valid.
Bits [27-31] are always OFF (therefore ignored).
// Test for early bail out: we check if 1 and only 1 bit is ON.
// Tokens "BIT_0" through "BIT_26" are macro in the form:
// #define BIT_13 (0x00002000UL)
switch (BitMask) // BitMask is a DWORD.
{
case BIT_13: return 1;
case BIT_0: case BIT_1:
case BIT_2: case BIT_3:
case BIT_4: case BIT_5:
case BIT_6: case BIT_7:
case BIT_8: case BIT_9:
case BIT_10: case BIT_11:
case BIT_12: case BIT_14:
case BIT_15: case BIT_16:
case BIT_17: case BIT_18:
case BIT_19: case BIT_20:
case BIT_21: case BIT_22:
case BIT_23: case BIT_24:
case BIT_25: case BIT_26: return -1;
}
// At this point, 2 or more bits are ON.
// Last chance for early bail out: if BIT_13 is ON, we return.
if (BitMask & BIT_13) return 0;
// Out of luck. More tests are needed.
// "Bottleneck - Part #2" will continue from here...
I can't think of a better way to write the above. Is there one?Bottleneck - Part #2:
The tests here take the same BitMask from above and first check if
a specific bit is ON. In case it is, they then check if any bit in a
specific combo of bits is ON.
The combo can hold from 4 to 17 bits, it doesn't matter since I pack
them bits into a single mask, like this:
// Define the combo of bits to use only if BIT_0 is ON. #define MASK_0 (BIT_14 | BIT_16 | BIT_17 | \ BIT_22 | BIT_23 | BIT_25 | BIT_26) if ((BitMask & BIT_0) && (BitMask & MASK_0)) return 0;In case BIT_0 is OFF, MASK_0 must not be checked.
Therefore I cannot pack BIT_0 in MASK_0, or it could return even
when BIT_0 is OFF.
The real problem is that I need 17 of these if(), and it's not possible
to predict which of them will more likely return most of the times.
Also, there's a rare chance that none of them will return.
The complete list of if() tests is:
#define MASK_0 (0x06C34000UL) // Bits : 14 16 17 22 23 25 26. #define MASK_1 (0x07E3D000UL) // Bits : 12 14 15 16 17 21 22 23 24 25 26. #define MASK_2 (0x03619000UL) // Bits : 12 15 16 21 22 24 25. #define MASK_3 (0x06DB4C00UL) // Bits : 10 11 14 16 17 19 20 22 23 25 26. #define MASK_4 (0x07FFDE00UL) // Bits : 9 10 11 12 14 15 16 17 18 19 20 21 22 23 24 25 26. #define MASK_5 (0x036D9600UL) // Bits : 9 10 12 15 16 18 19 21 22 24 25. #define MASK_6 (0x00D84C00UL) // Bits : 10 11 14 19 20 22 23. #define MASK_7 (0x00FC5E00UL) // Bits : 9 10 11 12 14 18 19 20 21 22 23. #define MASK_8 (0x006C1600UL) // Bits : 9 10 12 18 19 21 22. #define MASK_9 (0x06C34000UL) // Bits : 14 16 17 22 23 25 26. #define MASK_10 (0x07E3D000UL) // Bits : 12 14 15 16 17 21 22 23 24 25 26. #define MASK_11 (0x03619000UL) // Bits : 12 15 16 21 22 24 25. #define MASK_12 (0x06DB4000UL) // Bits : 14 16 17 19 20 22 23 25 26. // MASK_13 doesn't exist. #define MASK_14 (0x036D4000UL) // Bits : 15 16 18 19 21 22 24 25. #define MASK_15 (0x00D40000UL) // Bits : 19 20 22 23. #define MASK_16 (0x00FC0000UL) // Bits : 18 19 20 21 22 23. #define MASK_17 (0x006C0000UL) // Bits : 18 19 21 22. if ((BitMask & BIT_0) && (BitMask & MASK_0)) return 0; if ((BitMask & BIT_1) && (BitMask & MASK_1)) return 0; if ((BitMask & BIT_2) && (BitMask & MASK_2)) return 0; if ((BitMask & BIT_3) && (BitMask & MASK_3)) return 0; if ((BitMask & BIT_4) && (BitMask & MASK_4)) return 0; if ((BitMask & BIT_5) && (BitMask & MASK_5)) return 0; if ((BitMask & BIT_6) && (BitMask & MASK_6)) return 0; if ((BitMask & BIT_7) && (BitMask & MASK_7)) return 0; if ((BitMask & BIT_8) && (BitMask & MASK_8)) return 0; if ((BitMask & BIT_9) && (BitMask & MASK_9)) return 0; if ((BitMask & BIT_10) && (BitMask & MASK_10)) return 0; if ((BitMask & BIT_11) && (BitMask & MASK_11)) return 0; if ((BitMask & BIT_12) && (BitMask & MASK_12)) return 0; // No check for BIT_13. If it was ON, we'd have returned already. if ((BitMask & BIT_14) && (BitMask & MASK_14)) return 0; if ((BitMask & BIT_15) && (BitMask & MASK_15)) return 0; if ((BitMask & BIT_16) && (BitMask & MASK_16)) return 0; if ((BitMask & BIT_17) && (BitMask & MASK_17)) return 0; return -1;I know: it's ugly.
I need to perform these 17 tests faster.
Any help will be most appreciated.
Thanks In Advance,
Ciao ciao : )












