Re: [RFC PATCH] x86, fpu: Use eagerfpu by default on all CPUs

From: Maciej W. Rozycki
Date: Mon Feb 23 2015 - 21:14:17 EST


On Mon, 23 Feb 2015, Andy Lutomirski wrote:

> >> After a context switch, the instructions from the old task are no
> >> longer in the pipeline.
> >
> > I'd say it's implementation-specific. As I mentioned the i486 aborted
> > any transcendental x87 instruction in progress upon taking an exception or
> > interrupt. That was a model like you refer to, but as I also mentioned it
> > had its shortcomings.
>
> IRET is serializing, according to the docs (I think) and according
> to the Intel engineers I asked (I'm absolutely certain about this
> part). So FPU ops are entirely done at the end of a normal context
> switch.

No question about the serialising property of IRET, it has been like this
since the original Pentium implementation. Do you have an architecture
specification reference to back up your claim though as far as the FPU is
concerned? I'm asking because I am genuinely curious.

The x87 case is so special, there isn't anything there really that is
externally observable or should be affected by IRET or any other
synchronisation barriers apart from WAIT (or a waiting x87 instruction)
that has been there for this purpose since forever. And it would defeat
some documented benefits of running the FP pipeline in parallel.

And certainly such synchronisation didn't happen in the old days.

> We also always save the FPU context on every context switch away from
> a task that used the FPU, even in lazy mode. This is because we might
> switch the task back in on a different CPU, and we don't want to use
> an IPI to move the FPU context.

That's an interesting case too, although not necessarily related. If you
say that we always save the FP context eagerly for the purpose of process
migration, then sure, that invalidates any benefit we'd have from letting
the x87 proceed.

However I can see different ways to address this case that avoid the need
for eager FP context saving or an IPI:

1. We could bind any currently suspended process with an unsaved FP
context to the CPU it last executed on.

2. We could mark such a process for migration next time and let it execute
on the CPU that holds its FP context once more, and then save the FP
context eagerly on the way out.

In some cases a lazily retained FP context would be preempted by another
process before the process in question would resume anyway. In this case
any temporary binding to a CPU could be given up.
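
To make the two options and the release case above a bit more concrete,
here is a rough userspace sketch in C. It is only a model of the policy,
not kernel code: struct task, fp_owner_cpu, migrate_pending and all four
helpers are hypothetical names invented for illustration, and the place
where a real implementation would execute FXSAVE is only a comment.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_CPU (-1)

/* Hypothetical per-task state; none of these names exist in the kernel. */
struct task {
	int  fp_owner_cpu;	/* CPU whose FP registers still hold our
				   unsaved context, or NO_CPU */
	bool migrate_pending;	/* option 2: migrate after one more run */
};

/* Option 1: a task with an unsaved FP context is bound to its owner CPU. */
static bool can_run_on(const struct task *t, int cpu)
{
	return t->fp_owner_cpu == NO_CPU || t->fp_owner_cpu == cpu;
}

/* Option 2: instead of migrating now, note the request and run the task
   once more on the CPU that holds its FP context. */
static int pick_cpu(struct task *t, int wanted_cpu)
{
	if (can_run_on(t, wanted_cpu))
		return wanted_cpu;
	t->migrate_pending = true;
	return t->fp_owner_cpu;
}

/* Switching away from t on cpu.  Returns true if the FP context was
   saved eagerly, false if it was left live in the registers. */
static bool switch_out(struct task *t, int cpu)
{
	assert(cpu >= 0);
	if (t->migrate_pending) {
		/* A real implementation would execute FXSAVE here. */
		t->fp_owner_cpu = NO_CPU;
		t->migrate_pending = false;
		return true;
	}
	t->fp_owner_cpu = cpu;
	return false;
}

/* Another task claimed the FP unit on cpu, so t's context has been
   saved on its behalf and the temporary binding can be given up. */
static void fp_preempted(struct task *t, int cpu)
{
	if (t->fp_owner_cpu == cpu) {
		t->fp_owner_cpu = NO_CPU;
		t->migrate_pending = false;
	}
}
```

A task that is switched out on CPU 0 without a pending migration stays
bound to CPU 0; asking pick_cpu() for CPU 1 then returns 0 and sets the
flag, and the next switch_out() saves eagerly and lifts the binding.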

> Given that we're only talking about old CPUs here, I sincerely doubt
> that there's any relevant case in which an fxsave can usefully wait
> for a long-running transcendental op to finish while we continue doing
> useful work. *Especially* since there will almost certainly be
> several more mfences or atomic ops before the end of the context
> switch, even if we're lucky enough to complete the context switching
> using sysret.

I am not sure what you mean by FXSAVE usefully waiting for an op, please
elaborate. At the point you've reached FXSAVE and an earlier x87
instruction hasn't completed, you've already lost. The pipeline will be
stalled until the x87 instruction has completed and it can be hundreds of
cycles. My point therefore has been about not executing FXSAVE for the
old task until absolutely necessary, which with lazy FP context switching
would be at the next x87 (or SSE) instruction reached by the new task.
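
As an illustration of the deferral I mean, here is a small userspace model
in C of lazy FP context switching. It is a sketch under assumptions, not
what Linux does (as noted above, Linux saves on the way out): struct
fp_task, struct cpu and both functions are hypothetical, the ts field
merely models CR0.TS, and the save and restore are counted rather than
performed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping: how often a task's context was saved or
   loaded.  The load count includes first-time initialisation. */
struct fp_task {
	int saves;
	int restores;
};

struct cpu {
	struct fp_task *fp_owner;	/* whose context is in the FP regs */
	struct fp_task *current;	/* task currently running */
	bool ts;			/* models CR0.TS: next FP insn traps */
};

/* Context switch: touch no FP state, only arm the trap if the FP
   registers belong to somebody other than the incoming task. */
static void lazy_switch_to(struct cpu *c, struct fp_task *next)
{
	c->current = next;
	c->ts = (c->fp_owner != next);
}

/* The running task reaches an x87 (or SSE) instruction. */
static void fp_insn(struct cpu *c)
{
	if (!c->ts)
		return;			/* registers already ours */

	/* Models the #NM handler.  Only here would FXSAVE run for the
	   old owner, and only here could it stall on a pending x87 op. */
	if (c->fp_owner != NULL)
		c->fp_owner->saves++;
	c->current->restores++;		/* FXRSTOR, or first-use init */
	c->fp_owner = c->current;
	c->ts = false;
}
```

If task A uses the FPU, task B runs without touching it, and A resumes,
this model never saves A's context at all; the save happens only once B
actually executes an FP instruction.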

Likewise I don't see why MFENCE or an atomic operation should affect the
execution of, say, FSINCOS. Whether the results of FSINCOS arrive before
or after MFENCE, etc. is not externally observable.

And I'm not sure if this all affects old CPUs only -- I don't know how
much x87 software is out there, but after all these years I'd expect quite
a lot. Sure, lots of this can be recompiled to use SSE instead, but not
all, and even where it is feasible, that's an extra burden for people,
beyond, say, a routine hardware, Linux distribution or, for that matter,
lone kernel upgrade. Therefore I think we need to be careful not to
pessimise things for a subset of people too much, and ideally not at all.

And to be clear, I am not against removing lazy FP context switching per
se. I am just urging that we be careful with it and be absolutely sure
that it does not cause excessive harm.

I still wonder why Intel hasn't addressed some issues around this stuff
-- is it that there are not enough people using proper IEEE 754 arithmetic
on x86 hardware to attract the interest of hardware architecture
maintainers? After all the same issue applies to enabled IEEE 754
exceptions: a #MF/#XM exception isn't going to take any less time than a
#NM fault. Or maybe I'm just missing something?

Maciej