Re: Crusoe's persistent translation on linux?

From: dean gaudet (dean-list-linux-kernel@arctic.org)
Date: Fri Jun 20 2003 - 12:46:46 EST


On Fri, 20 Jun 2003, John Bradford wrote:

> Would it be possible, (with relevant documentation), to tune the code
> morphing software for optimum performance of code generated by a
> specific compiler, though?
>
> If a particular version of GCC favours certain constructs and uses
> particular sets of registers for a given piece of code, couldn't we
> optimise for those cases, at the expense of others? Maybe a
> particular compiler doesn't use certain X86 instructions at all, and
> these could be eliminated altogether?

very little of the translation overhead is involved with decoding the x86
instructions... most of the translation overhead is in scheduling and
optimising for the VLIW.

there are some tricks which you can apply at the x86 level which favour
CMS much more than they favour other processors... specifically one
related to what jeff brought up:

On Fri, 20 Jun 2003, Jeff Garzik wrote:

> Newer CPUs do register renaming in an attempt to avoid the
> register-starved ISA issue. I presume Xmeta would do something
> similar...

yeah CMS does this internally.

one way you can exploit this for performance is within a basic block (or
within a code path that is most likely to be executed, with only a handful
of rarely/never taken branches out of it): you can express every
sub-expression completely without worrying about its schedule -- which
gives you access to all the source x86 registers. CMS will reschedule it
to fit onto the pipeline for you, and rename to internal registers.

this can be a huge help for floating point code when you also unroll the
code. for example if you are doing a polynomial expansion, you can simply
write this:

        f0 = a0 + x0*(a1 + x0*(a2 + x0*(a3 + x0*a4)));
        f1 = a0 + x1*(a1 + x1*(a2 + x1*(a3 + x1*a4)));

you don't need to schedule the x87 code at all -- CMS will do it for you.
this example is pretty trivial, but if you have sequences which overflow
the x87 register set and would require stack operations when scheduled for
a typical x86 processor, you can mostly avoid those stack operations when
"scheduling" for CMS.
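
to make the unrolled form concrete, here is a minimal sketch in C of the
same idea. the function name poly4_x2, the coefficient array a[] and the
use of double are assumptions for illustration, not anything CMS requires,
and the sketch assumes both evaluations share one set of coefficients. each
result is written as a single Horner expression with no manual
interleaving of the two chains.

```c
/* Hypothetical helper (name and signature are illustrative only):
 * evaluates one degree-4 polynomial, with coefficients a[0]..a[4],
 * at two independent points x0 and x1.
 *
 * The two Horner chains are independent of each other, but the source
 * makes no attempt to interleave them by hand. That job is left to
 * whatever schedules the code afterwards.
 */
static void poly4_x2(const double a[5], double x0, double x1,
                     double *f0, double *f1)
{
    *f0 = a[0] + x0 * (a[1] + x0 * (a[2] + x0 * (a[3] + x0 * a[4])));
    *f1 = a[0] + x1 * (a[1] + x1 * (a[2] + x1 * (a[3] + x1 * a[4])));
}
```

calling poly4_x2(a, x0, x1, &f0, &f1) fills in both results. whether a
given compiler keeps the two chains separate in the x87 code it emits is
compiler-dependent, so treat this as a sketch of the source-level shape
only.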

think of it as a huge reorder buffer.

-dean