Re: [patch] entry.S asm improvement (removed some ugly jmp)

Linus Torvalds (torvalds@transmeta.com)
Fri, 27 Nov 1998 23:38:04 -0800 (PST)


On Sat, 28 Nov 1998, Andrea Arcangeli wrote:
>
> But switch_to()/__switch_to() (include/asm-i386/system.h) uses an odd
> number of call/ret. Why? Am I missing something (maybe because it's too
> late ;).

When you actually end up doing a context switch, the return predictor is
hosed anyway - there's no way it can know what is going on, as the stack
is getting switched from underneath it.
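
For illustration, the shape of the thing can be seen in user space too.
This is not the kernel's switch_to, just a ret that has no matching call;
build with something like "gcc -m32 -O2 -fno-pie -no-pie":

/* unpaired_ret.c: enter a "routine" with a manufactured return address
 * and a jmp instead of a call.  The ret pops the pushed address, so as
 * far as the CPU's return stack is concerned it came out of nowhere.
 * In the real context switch %esp has been swapped as well, so there is
 * nothing sane the predictor could do even with balanced call/ret.
 */
#include <stdio.h>

int main(void)
{
        asm volatile("pushl $1f\n\t"    /* manufacture a return address */
                     "jmp 2f\n"         /* enter without a call */
                     "2:\tret\n"        /* pops 1f: no call to match it */
                     "1:"
                     ::: "memory");
        printf("the unpaired ret landed here\n");
        return 0;
}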

However, the true reason is simply that I'm not consistent either. I've
certainly been known to write code that kills call/return predictors, and
I don't have anything fundamental against them. I only reacted to your
"obviously correct and faster" comment, when it really really isn't that
obvious at all.

Branch prediction is a big deal for modern CPUs, where a mispredicted
branch on a PII easily takes 15+ cycles. Normal call-return sequences tend
to be very predictable indeed if the CPU has a return stack and nobody is
playing mind-games, and as such it usually pays to try to make it easy for
hardware.
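
If you want to put a rough number on it, a crude user-space loop will do.
The absolute numbers depend entirely on which CPU you run it on and mean
nothing by themselves; the interesting part is the ratio between a matched
call/ret and the unpaired ret from the example above. Build the same way
(gcc -m32 -O2 -fno-pie -no-pie):

/* ret_cost.c: compare a matched call/ret pair against an unpaired ret.
 * Crude wall-clock timing only.
 */
#include <stdio.h>
#include <time.h>

#define ITERS 100000000         /* arbitrary, just big enough to time */

static void __attribute__((noinline)) matched(void)
{
        asm volatile("");       /* keep the call from being optimized away */
}

static double now(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
        double t0, t1, t2;
        int i;

        t0 = now();
        for (i = 0; i < ITERS; i++)
                matched();      /* call ... ret: the return stack is happy */
        t1 = now();
        for (i = 0; i < ITERS; i++)
                asm volatile("pushl $1f\n\t"
                             "jmp 2f\n"
                             "2:\tret\n" /* no call to match: mispredicted */
                             "1:" ::: "memory");
        t2 = now();
        printf("matched call/ret: %.2fs\n", t1 - t0);
        printf("unpaired ret:     %.2fs\n", t2 - t1);
        return 0;
}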

However, your patches have the potential to sometimes avoid needing an
icache entry for the return path, which is good. The point being that it is
not at all obvious which code is actually the faster one.

Linus
