Performance of combining local integers

Posted: Sun Jul 23, 2006 9:07 am
by Ronan
I ran some tests on the speed of packing several numbers into one local int versus storing each in its own separate int. Results varied, but I think a performance increase of at least 22% is likely with four 8-bit numbers per int, and the gain should increase as the number of locals stored on one object does. The packed code executes faster even without counting the speedup each GetLocalInt() call gets when the object holds fewer local variables.

The 22% gain came from comparing 10 GetLocalInt() calls on an object with 10 local integers against 4 GetLocalInt() calls plus 10 GetPiecewiseInteger() calls on an object with 4 local integers, measured over 100,000 iterations.

Normal code:

GetLocalInt(OBJECT_SELF, "V0");
GetLocalInt(OBJECT_SELF, "V1");
GetLocalInt(OBJECT_SELF, "V2");
GetLocalInt(OBJECT_SELF, "V3");
GetLocalInt(OBJECT_SELF, "V4");
GetLocalInt(OBJECT_SELF, "V5");
GetLocalInt(OBJECT_SELF, "V6");
GetLocalInt(OBJECT_SELF, "V7");
GetLocalInt(OBJECT_SELF, "V8");
GetLocalInt(OBJECT_SELF, "V9");
...
Piecewise code:

v0 = GetLocalInt(OBJECT_SELF, "V0");
v1 = GetLocalInt(OBJECT_SELF, "V1");
v2 = GetLocalInt(OBJECT_SELF, "V2");
GetPiecewiseInteger(v0, 0, 7);
GetPiecewiseInteger(v0, 8, 15);
GetPiecewiseInteger(v0, 16, 23);
GetPiecewiseInteger(v0, 24, 31);
GetPiecewiseInteger(v1, 0, 7);
GetPiecewiseInteger(v1, 8, 15);
GetPiecewiseInteger(v1, 16, 23);
GetPiecewiseInteger(v1, 24, 31);
GetPiecewiseInteger(v2, 0, 7);
GetPiecewiseInteger(v2, 8, 15);
...
To get these piecewise numbers, I use this function in acr_tools_i:

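// Returns bits nStartBit..nEndBit (inclusive, bit 0 = lowest) of nNum.
int GetPiecewiseInteger(int nNum, int nStartBit, int nEndBit) {
    // Move nEndBit up to bit 31 so every higher bit is shifted out.
    nNum = nNum << (31 - nEndBit);
    // Unsigned shift back down so nStartBit lands on bit 0.
    return ( nNum >>> (31 - nEndBit + nStartBit) );
}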
So this could be useful when memory use is important, when we are storing a lot of locals on one object, or both.
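For anyone who wants to check the bit arithmetic outside the toolset, here is a minimal sketch in C rather than NWScript. It treats both bit indices as inclusive, matching calls like GetPiecewiseInteger(v0, 8, 15), and it uses a cast to uint32_t to stand in for NWScript's >>> operator. SetPiecewiseInteger is a hypothetical counterpart for writing a field; nothing in this thread says acr_tools_i has one.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits nStartBit..nEndBit (inclusive, bit 0 = least significant). */
int GetPiecewiseInteger(int nNum, int nStartBit, int nEndBit) {
    uint32_t u = (uint32_t)nNum;   /* unsigned, so >> below is a logical shift */
    u <<= (31 - nEndBit);          /* shift out everything above nEndBit */
    return (int)(u >> (31 - nEndBit + nStartBit)); /* nStartBit lands on bit 0 */
}

/* Hypothetical helper: store nValue in bits nStartBit..nEndBit of nNum. */
int SetPiecewiseInteger(int nNum, int nValue, int nStartBit, int nEndBit) {
    assert(nStartBit >= 0 && nEndBit <= 31 && nStartBit <= nEndBit);
    uint32_t width = (uint32_t)(nEndBit - nStartBit + 1);
    /* Mask with 'width' one-bits, positioned at nStartBit. */
    uint32_t mask = (width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u)) << nStartBit;
    /* Clear the old field, then OR in the new value (truncated to the field). */
    uint32_t u = ((uint32_t)nNum & ~mask) | (((uint32_t)nValue << nStartBit) & mask);
    return (int)u;
}
```

In a script, the packed value would be read once with GetLocalInt(), unpacked as often as needed, and written back with SetLocalInt() only when a field changes. Values wider than the field are silently truncated by the mask, so range checks are up to the caller.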

Posted: Sun Jul 23, 2006 7:47 pm
by ç i p h é r
Interesting but will we ever perform 100,000 iterations in sequence? Is the gain tied to iterations or is it actually a fixed % difference between the methods?

I've been thinking about efficiency as well, but if we're only squeezing out marginal gains, I'd favor the simplicity and convenience of dedicated integers. I do think, though, that each system should store its own flags (booleans) as bits on a single integer. That's perfectly intuitive (you don't have to worry about data range, overflows, etc.) and will save memory.
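A rough sketch of the flags-on-one-integer idea, written in C so it can be compiled outside the toolset. The flag names and the SetFlag/HasFlag helpers are made up for illustration and are not part of any existing include file.

```c
#include <assert.h>

/* Hypothetical flags for one system; each one occupies a single bit. */
enum {
    FLAG_SPAWNED = 1 << 0,
    FLAG_HOSTILE = 1 << 1,
    FLAG_RESTING = 1 << 2
};

/* Return nFlags with nFlag switched on (bOn != 0) or off (bOn == 0). */
int SetFlag(int nFlags, int nFlag, int bOn) {
    assert(nFlag != 0); /* a flag must name at least one bit */
    return bOn ? (nFlags | nFlag) : (nFlags & ~nFlag);
}

/* Return 1 if every bit of nFlag is set in nFlags, otherwise 0. */
int HasFlag(int nFlags, int nFlag) {
    return (nFlags & nFlag) == nFlag;
}
```

In a script, the whole set would live in one local: read it with GetLocalInt(), change it with something like SetFlag(), and store it back with SetLocalInt(). Since each flag is exactly one bit, there is no value range to overflow.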

Posted: Sun Jul 23, 2006 9:37 pm
by Ronan
ç i p h é r wrote: Interesting but will we ever perform 100,000 iterations in sequence? Is the gain tied to iterations or is it actually a fixed % difference between the methods?
Seems a fixed percentage. I'd really only planned to use this sort of thing on cSkins and spawn points, though after this test maybe just cSkins make sense.