Re: ldl_u() used for aligned access

Richard Henderson (richard@atheist.tamu.edu)
Mon, 11 Nov 1996 15:03:17 -0600 (CST)


> what's the best way to have a ldl_u() routine which does
> use the special asm commands only when the argument is really unaligned

There is no good way to do this.

> since using the unaligned access commands takes much more time
> than a normal access, so the additional test shouldn't be too bad
> and it would give a big performance win for aligned accesses.

False. Branch mispredict penalties are a bear -- 3 cycles on the ev4
and 5 cycles on the ev5:

                        #          issue cycle numbers
                        #        ev4                 ev5
                        # branching fallthru  branching fallthru
    and   addr,7,r1     #     0                   0
    beq   r1,1f         #     1                   0
    ldq_u r1,0(addr)    #     2        0          1        0
    ldq_u r2,7(addr)    #     3        1          1        0
    extql r1,addr,r1    #     5        3          3        2
    extqh r2,addr,r2    #     6        4          4        3
    or    r1,r2,r1      #     9        7          5        4
    br    2f            #    10                   5
1:  ldq   r1,0(addr)    #     4                   5
2:  /* use r1 */        #   11/7       8         6/7       5
                        # (a/b = unaligned path / aligned path)

(Assumes data is in L1 cache, otherwise load delays dominate and the
whole issue is moot.)
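
For readers who don't speak Alpha assembly, here is a rough C model of the
two variants in the listing. It is only a sketch under stated assumptions:
sim_ldq_u(), load_q_fallthru() and load_q_branching() are made-up names, not
anything from the kernel headers, and memory is modelled as a byte array read
in little-endian order so the example doesn't depend on the host. It shows the
logic only; it says nothing about cycle counts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ldq_u: fetch the aligned 8-byte quadword
   that contains byte offset `off`, assembled in little-endian order. */
static uint64_t sim_ldq_u(const uint8_t *mem, size_t off)
{
    size_t base = off & ~(size_t)7;   /* ldq_u ignores the low 3 address bits */
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | mem[base + i];
    return v;
}

/* "fallthru" variant: always run the unaligned sequence, no branch. */
static uint64_t load_q_fallthru(const uint8_t *mem, size_t off)
{
    unsigned sh = (unsigned)(off & 7) * 8;    /* bit offset inside the quadword */
    uint64_t lo = sim_ldq_u(mem, off);        /* ldq_u r1,0(addr)  */
    uint64_t hi = sim_ldq_u(mem, off + 7);    /* ldq_u r2,7(addr)  */
    lo >>= sh;                                /* extql r1,addr,r1  */
    hi <<= (64 - sh) & 63;                    /* extqh r2,addr,r2  */
    return lo | hi;                           /* or    r1,r2,r1    */
}

/* "branching" variant: test the alignment first, as the question proposed. */
static uint64_t load_q_branching(const uint8_t *mem, size_t off)
{
    if ((off & 7) == 0)                       /* and addr,7,r1 ; beq r1,1f */
        return sim_ldq_u(mem, off);           /* 1: ldq r1,0(addr) */
    return load_q_fallthru(mem, off);         /* unaligned sequence, then br 2f */
}
```

Note that the fall-through version needs no special case for aligned
addresses: both loads then fetch the same quadword, both shifts are by zero,
and the OR of a value with itself is that value.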

So on the ev4 fallthru is 1 cycle slower for aligned accesses (8 vs. 7) --
hardly "much more time" -- and on the ev5 fallthru always wins (5 vs. 6 or 7).
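
To put a number on the trade-off, the end-to-end figures from the table can
be averaged over a mix of aligned and unaligned accesses. This is a
back-of-the-envelope sketch, not a measurement: it takes the table's cycle
counts at face value (L1 hits, the same branch behaviour on every access),
and the names are made up for illustration.

```c
/* Cycle counts at "use r1", copied from the table above. */
enum {
    EV4_BRANCH_ALIGNED = 7, EV4_BRANCH_UNALIGNED = 11, EV4_FALLTHRU = 8,
    EV5_BRANCH_ALIGNED = 7, EV5_BRANCH_UNALIGNED = 6,  EV5_FALLTHRU = 5
};

/* Average cost of the branching version, scaled by 100 to stay in
   integers, when `aligned_pct` percent of the accesses are aligned. */
static int branching_cost_x100(int aligned, int unaligned, int aligned_pct)
{
    return aligned * aligned_pct + unaligned * (100 - aligned_pct);
}
```

Under those assumptions the ev4 branching version only breaks even with
fallthru (8 cycles) once 75% of the accesses are aligned, and even with every
access aligned it gains just 1 cycle. On the ev5 the branching version never
gets below 6 cycles, against 5 for fallthru.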

r~