Tribes server 1.11 vs 1.30/1.40

Another reason to abandon 1.30 is some new Tribes technology which has finally made it out of the Unhelpful labs.

You see, Tribes 1.11 was compiled with the Borland 5.x compiler. It generates x87 (387) floating point instructions rather than the newer SSE floating point instructions, which are much faster.

While we can't recompile Tribes YET to generate the SSE instructions, we can look at the Tribes 1.11 image and find places where "peephole optimization" was not done. For those who know x86 assembler, here is an example of what is in the image:

:00404C97 D99B9C000000 fstp 32real[ebx+0000009C]
:00404C9D D9839C000000 fld 32real[ebx+0000009C]

So the compiler is storing a result and popping it off the FPU stack, and then loading the same result straight back in. By altering the fstp to an fst (which stores without popping) and then nop'ing out the second instruction, many cycles can be recovered. This occurs in several spots, so the win is large enough to be of interest. Some of these are inside loops, which makes it even more delicious. On Northwood and newer CPUs the nops are dropped out in the pipeline so they cost even less. BOOYAH!
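For anyone who wants to see the byte-level change, here is a rough Python sketch of the patch described above. It is not the tool actually used on the server; the function name `patch_store_reload` is made up for illustration, and it assumes the pair always uses the `[reg+disp32]` addressing form seen in the listing. The encodings themselves are standard x86: `D9 /3` is fstp m32, `D9 /2` is fst m32, `D9 /0` is fld m32, and `90` is nop.

```python
def patch_store_reload(code):
    """Rewrite 'fstp m32; fld m32' on the same [reg+disp32] into 'fst m32; nop x6'.

    Returns (patched_bytes, list_of_offsets_patched).
    Illustrative only: a raw byte scan can match data or the middle of another
    instruction, and a jump that lands on the fld would break the rewrite, so
    every hit should be checked against a real disassembly before patching.
    """
    out = bytearray(code)
    hits = []
    i = 0
    while i + 12 <= len(out):
        a = out[i:i + 6]        # candidate fstp: D9, ModRM, disp32
        b = out[i + 6:i + 12]   # candidate fld:  D9, ModRM, disp32
        rm = a[1] & 0x07        # base register field of the ModRM byte
        if (a[0] == 0xD9 and b[0] == 0xD9
                and (a[1] & 0xF8) == 0x98   # mod=10, reg=/3 -> fstp m32
                and b[1] == (0x80 | rm)     # mod=10, reg=/0 -> fld m32, same base
                and rm != 4                 # rm=100 means a SIB byte follows
                and a[2:6] == b[2:6]):      # same displacement
            out[i + 1] = 0x90 | rm          # reg=/2 -> fst m32 (store, no pop)
            out[i + 6:i + 12] = b"\x90" * 6 # nop out the redundant reload
            hits.append(i)
            i += 12
        else:
            i += 1
    return bytes(out), hits


original = bytes.fromhex("D99B9C000000" "D9839C000000")
patched, hits = patch_store_reload(original)
print(patched.hex().upper(), hits)
```

One thing to keep in mind: fstp/fld through a 32real slot rounds the value to single precision on the way, while fst leaves the unrounded value on the FPU stack. Depending on the FPU precision setting that may or may not change results, so it is worth checking each spot.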

We will be changing the Tribes exe on the United States Base server to see how much of a performance improvement we will get. The win here affects both server and client.

Unfortunately lasthope will interfere with the "peephole optimized" Tribes because the memory CRCs of the patched image will no longer match the ones it expects.
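To illustrate why even a tiny patch trips a CRC check, here is a small Python sketch. The checksum lasthope really uses is not known here; `zlib.crc32` is only a stand-in, and `same_checksum` is a made-up helper name.

```python
import zlib


def same_checksum(a, b):
    """Compare two byte strings by CRC-32 (stand-in for whatever lasthope computes)."""
    return zlib.crc32(a) == zlib.crc32(b)


fstp_bytes = bytes.fromhex("D99B9C000000")  # fstp 32real[ebx+0000009C]
fst_bytes = bytes.fromhex("D9939C000000")   # fst  32real[ebx+0000009C]

# The two encodings differ in a single bit, and CRC-32 always detects that.
print(same_checksum(fstp_bytes, fstp_bytes))
print(same_checksum(fstp_bytes, fst_bytes))
```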

There may be other optimizations that can be made to the Tribes.exe even without the source code that yield enough of a performance improvement to warrant abandoning lasthope.

I played today and felt the difference

x86 level optimization is pretty insane

Bugs and Unhelpful are really smart people
 