Re: Semaphore assembly-code bug

From: Linus Torvalds
Date: Fri Oct 29 2004 - 13:20:05 EST




On Fri, 29 Oct 2004, linux-os wrote:
> > with the following:
> >
> > leal 4(%esp),%esp
>
> Probably so because I'm pretty certain that the 'pop' (a memory
> access) is not going to be faster than a simple register operation.

Bzzt, wrong answer.

It's not "simple register operation". It's really about the fact that
modern CPU's are smarter - yet dumber - then you think. They do things
like speculate the value of %esp in order to avoid having to calculate it,
and it's entirely possible that "pop" is much faster, simply because I
guarantee you that a CPU will speculate %esp correctly across a "pop", but
the same is not necessarily true for "lea %esp".

Somebody should check what the Pentium M does. It might just notice that
"lea 4(%esp),%esp" is the same as "add 4 to esp", but it's entirely
possible that lea will confuse its stack engine logic and cause
stack-related address generation stalls..

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/