Re: Semaphore assembly-code bug

From: Linus Torvalds
Date: Mon Nov 01 2004 - 22:24:48 EST




On Mon, 1 Nov 2004, Linus Torvalds wrote:
>
> So can you _please_ just admit that you were wrong? On a P4, the pop/pop
> is the same cost as lea/pop, and on a Pentium M the pop/pop is faster,
> according to this test. Your contention that "pop" has to be slower than
> "lea" is WRONG.

Btw, I'd like to emphasize "this test". Modern OoO CPU's are complex
animals. They have pipeline quirks etc that just means that things depend
on alignment, on code around it, and on register usage patterns of the
instructions that you test _and_ the instructions around those
instructions. So take any proof with a pinch of salt, because there are
bound to be other circumstances where factors around the code just change
the assumptions.

In short, any time you're looking at single cycle timings, you should be
very aware of the fact that your measurements are suspect. The best way to
avoid most of the problem is to never try to measure single cycles.
Measure performance on a program, not on a single instruction.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/