I know this is old hat, but it keeps biting people...
--david
PS: Not to give the wrong impression: the optimization looks *very* promising
and I think it's well worth doing. Even if you can't get rid of the load of
$27, it is still preferable to use a bsr instruction, since it has no data
dependency on the load of $27 and is always predicted correctly.
A final note: OSF/1's linker does this kind of optimization when using the
-om flag (and possibly others, I forget). It may be worthwhile to play with
that a little to get a feeling for what would have the biggest payoff.