Re: [rfc patch] i386: don't save eflags on task switch

From: Zachary Amsden
Date: Sat Nov 04 2006 - 14:10:00 EST


Chuck Ebbert wrote:
In-Reply-To: <Pine.LNX.4.64.0611031645141.25218@xxxxxxxxxxx>

On Fri, 3 Nov 2006 16:46:25 -0800, Linus Torvalds wrote:

On Fri, 3 Nov 2006, Chuck Ebbert wrote:
There is no real need to save eflags in switch_to(). Instead,
we can keep a constant value in the thread_struct and always
restore that.
I don't really see the point. The "pushfl" isn't the expensive part, and it gives sane and expected semantics.

The "popfl" is the expensive part, and that's the thing that can't really even be removed.

Well that wasn't the impression I got:

Date: Mon, 18 Sep 2006 12:12:51 -0400
From: Benjamin LaHaise <bcrl@xxxxxxxxx>
Subject: Re: Sysenter crash with Nested Task Bit set

...

It's the pushfl that will be slow on any OoO CPU, as it has dependancies on any previous instructions that modified the flags, which ends up bringing all of the memory ordering dependancies into play. Doing a popfl to set the flags to some known value is much less expensive.

That doesn't sound correct to me. The popf is far more expensive. There is no popfl $IMM instruction, so setting flags never can avoid the memory read and must make some more expensive assumptions about effects on further instruction stream (TF, DF, all sign flags for conditional jumps...).

Every processor I've ever measured it on, popf is slower. On P4, for example, pushf is 6 cycles, and popf is 54. On Opteron, it is 2 / 12. On Xeon, it is 7 / 91.

Zach
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/