Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

From: Andy Lutomirski
Date: Mon Feb 23 2015 - 18:45:23 EST


On Mon, Feb 23, 2015 at 2:27 PM, Maciej W. Rozycki <macro@xxxxxxxxxxxxxx> wrote:
> On Mon, 23 Feb 2015, Rik van Riel wrote:
>
>> > I meant something else -- a slow FPU instruction can retire after a
>> > task has been switched where the FP context has been left intact,
>> > i.e. in the lazy FP context switching case, where only the MMU
>> > context and GPRs have been replaced.
>>
>> I don't think that's true, because changing the MMU context and GPRs
>> also includes changing the instruction pointer, and changing over the
>> execution to the new task.
>
> That does not matter. The instructions in question only operate on x87
> internal registers: the data stack registers, specifically ST(0) and
> possibly also ST(1), and consequently the Tag Word register, and the
> Status Word register. No CPU resource such as the MMU or GPRs need to be
> referred for an x87 instruction to complete. Any unmasked IEEE 754 FPU
> exception recorded on the way is only signalled at the next x87
> instruction.
>
>> After a context switch, the instructions from the old task are no
>> longer in the pipeline.
>
> I'd say it's implementation-specific. As I mentioned the i486 aborted
> any transcendental x87 instruction in progress upon taking an exception or
> interrupt. That was a model like you refer to, but as I also mentioned it
> had its shortcomings.

IRET is serializing, according to the docs (I think) and according
to the Intel engineers I asked (I'm absolutely certain about this
part). So FPU ops are entirely done at the end of a normal context
switch.

We also always save the FPU context on every context switch away from
a task that used the FPU, even in lazy mode. This is because we might
switch the task back in on a different CPU, and we don't want to use
an IPI to move the FPU context.
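
To make the point concrete, here is a rough user-space toy model of that
behaviour. It is only a sketch, not the kernel's code: every name in it
(struct task, struct cpu, switch_to, fpu_write, fpu_read, demo_migration)
is hypothetical, a single double stands in for the whole FPU register
file, and a bool stands in for CR0.TS. It shows the state being saved on
switch-out even in lazy mode, so the task can later be resumed on a
different CPU without asking the old CPU for anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are illustrative names, not kernel symbols. */
struct fpu_regs { double st0; };     /* stands in for the full FPU state */

struct task {
    struct fpu_regs saved;           /* in-memory copy of the task's FPU state */
    bool used_fpu;                   /* task touched the FPU since switch-in */
};

struct cpu {
    struct fpu_regs regs;            /* live hardware registers on this CPU */
    bool ts;                         /* models CR0.TS: next FPU use traps */
    struct task *current;
};

/* Context switch. The save on switch-out happens in both modes. */
static void switch_to(struct cpu *c, struct task *next, bool eager)
{
    struct task *prev = c->current;

    if (prev && prev->used_fpu) {
        prev->saved = c->regs;       /* stands in for fxsave */
        prev->used_fpu = false;
    }

    c->current = next;
    if (eager) {
        c->regs = next->saved;       /* stands in for fxrstor */
        next->used_fpu = true;
        c->ts = false;
    } else {
        c->ts = true;                /* lazy: defer restore to first FPU use */
    }
}

/* Models the trap taken on the first FPU use after a lazy switch. */
static void fpu_fault_if_needed(struct cpu *c)
{
    if (c->ts) {
        c->regs = c->current->saved;
        c->ts = false;
    }
    c->current->used_fpu = true;
}

static void fpu_write(struct cpu *c, double v)
{
    fpu_fault_if_needed(c);
    c->regs.st0 = v;
}

static double fpu_read(struct cpu *c)
{
    fpu_fault_if_needed(c);
    return c->regs.st0;
}

/* Task a uses the FPU on cpu0, is switched out, then resumes on cpu1. */
double demo_migration(void)
{
    struct cpu cpu0 = { .current = NULL }, cpu1 = { .current = NULL };
    struct task a = { .used_fpu = false }, b = { .used_fpu = false };

    switch_to(&cpu0, &a, false);
    fpu_write(&cpu0, 1.5);           /* a's value now lives only in cpu0's regs */
    switch_to(&cpu0, &b, false);     /* lazy mode, yet a's state is saved here */
    fpu_write(&cpu0, 2.5);           /* b overwrites cpu0's live registers */

    switch_to(&cpu1, &a, false);     /* a resumes on another CPU, no IPI */
    return fpu_read(&cpu1);          /* restored from a.saved */
}
```

In this model the only thing lazy mode defers is the restore on
switch-in; the save on switch-out is unconditional for a task that used
the FPU, which is why demo_migration() can read a's value back on cpu1.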

Given that we're only talking about old CPUs here, I sincerely doubt
that there's any relevant case in which an fxsave can usefully wait
for a long-running transcendental op to finish while we continue doing
useful work. *Especially* since there will almost certainly be
several more mfences or atomic ops before the end of the context
switch, even if we're lucky enough to complete the context switch
using sysret.

--Andy