Re: [PATCH, 3.18] sleeping function called from invalid context

From: Rik van Riel
Date: Wed Dec 10 2014 - 21:47:56 EST

On 12/10/2014 08:53 PM, Andy Lutomirski wrote:
> On Wed, Dec 10, 2014 at 5:32 PM, Rik van Riel <riel@xxxxxxxxxx>
> wrote:
>> On 12/10/2014 07:51 PM, Andy Lutomirski wrote:
>>> On Wed, Dec 10, 2014 at 4:49 PM, Rik van Riel
>>> <riel@xxxxxxxxxx> wrote:
>>> On 12/10/2014 07:46 PM, Daniel J Blueman wrote:
>>>>>> Gah. I had some non-temporal copy changes in the wrong
>>>>>> tree. I'll check with a definitely clean tree and follow
>>>>>> up if it still occurs.
>>> The exception handlers should definitely allow sleeping, so I
>>> suspect those changes may be related.
>>>> It would be really, really nice if we could arrange for
>>>> kernel_fpu_begin to be unconditionally usable in anything
>>>> except NMI context. The crypto code would be much less
>>>> scary, we could make non-temporal copies safe, etc. Can we
>>>> have ponies, too?
>> Isn't it already?
>> I see nothing in __kernel_fpu_begin that looks like it would ever
>> need to sleep.
> It never needs to sleep, but it does need somewhere to save the
> previous state. See irq_fpu_usable.
> FWIW, I don't understand what the comments above
> interrupted_kernel_fpu_idle are talking about. The issue that I
> understand is:
> kernel_fpu_begin()
>   irq: kernel_fpu_begin()
>        use xstate
>        kernel_fpu_end()
> we're screwed now :(
> kernel_fpu_end()
> IOW we need somewhere to put the state from the thing we
> interrupted.

Good point. An interruptible kernel_fpu_begin needs to provide a
place to put the state. Alternatively, the one that runs from irq
context could provide the place to store the current context, with
a kernel_fpu_begin_irq() function...
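
To make the idea concrete, here is a rough userspace sketch of the
"caller provides the save area" variant. Nothing in it is kernel code:
struct fake_xstate, cpu_regs and the sim_* functions are hypothetical
stand-ins, and the 64-byte register file is arbitrary. A real xstate is
much larger, so whether it could live on an irq stack is an open
question that this sketch does not answer.

```c
#include <string.h>

/* Hypothetical stand-in for the extended register state; not the kernel's type. */
struct fake_xstate {
	unsigned char regs[64];
};

/* Simulated "live" CPU registers. */
static struct fake_xstate cpu_regs;

/* Hypothetical begin: the caller supplies the storage, so a nested user
 * cannot overwrite the state saved by the context it interrupted. */
static void sim_fpu_begin(struct fake_xstate *save)
{
	memcpy(save, &cpu_regs, sizeof(*save));
}

/* Hypothetical end: restore whatever was live when begin was called. */
static void sim_fpu_end(const struct fake_xstate *save)
{
	memcpy(&cpu_regs, save, sizeof(cpu_regs));
}

/* Simulated interrupt handler that uses the FPU with its own save area. */
static void sim_irq_handler(void)
{
	struct fake_xstate irq_save;

	sim_fpu_begin(&irq_save);
	memset(&cpu_regs, 0xBB, sizeof(cpu_regs));	/* irq scribbles on the registers */
	sim_fpu_end(&irq_save);
}

/* Returns 1 if both the interrupted kernel user's registers and the
 * original task state survive the nesting, 0 otherwise. */
static int sim_nested_use(void)
{
	struct fake_xstate task_save;
	int outer_intact;

	memset(&cpu_regs, 0x11, sizeof(cpu_regs));	/* task's user state */
	sim_fpu_begin(&task_save);
	memset(&cpu_regs, 0xAA, sizeof(cpu_regs));	/* outer kernel FPU use */
	sim_irq_handler();				/* interrupt arrives mid-use */
	outer_intact = (cpu_regs.regs[0] == 0xAA);
	sim_fpu_end(&task_save);

	return outer_intact && cpu_regs.regs[0] == 0x11;
}
```

With a single shared save slot instead of irq_save, the handler's save
would replace the task state and the final restore would bring back the
wrong registers, which is the "we're screwed now" case quoted above.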

> This gets extra fun if some thread does something that takes a
> page fault that uses fpu that gets interrupted, etc. Fortunately,
> I think that can't happen -- kernel_fpu_begin disables preemption.
> So I think we have a maximum of one active FPU context per thread
> plus some number per cpu. Maybe we could have a percpu array of
> ten or twenty xstates to handle all possible nesting.
> Also, can we just delete the non-eager code some day?
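
The per-cpu array of save areas suggested in the quoted text could look
roughly like the sketch below. It is a userspace simulation under
stated assumptions, not kernel code: SIM_MAX_NEST, struct slot_xstate,
slot_cpu_regs and the sim_fpu_push/sim_fpu_pop names are all made up
for illustration, the depth of 4 is arbitrary, and a single global
array stands in for what would be per-cpu data.

```c
#include <assert.h>

#define SIM_MAX_NEST 4	/* arbitrary nesting limit for this sketch */

/* Hypothetical stand-in for the extended register state. */
struct slot_xstate {
	unsigned char regs[64];
};

/* Simulated live registers and the fixed pool of save slots. */
static struct slot_xstate slot_cpu_regs;
static struct slot_xstate save_slots[SIM_MAX_NEST];
static int nest_depth;

/* Save the live state into the next free slot.
 * Returns 0 on success, -1 if every slot is already in use. */
static int sim_fpu_push(void)
{
	if (nest_depth == SIM_MAX_NEST)
		return -1;
	save_slots[nest_depth++] = slot_cpu_regs;
	return 0;
}

/* Restore the most recently saved state and release its slot. */
static void sim_fpu_pop(void)
{
	assert(nest_depth > 0);
	slot_cpu_regs = save_slots[--nest_depth];
}
```

Because saves and restores nest strictly, a depth counter is enough to
pick the slot. The sketch returns an error when the pool is exhausted;
what a real implementation should do in that case is left open here.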

The XSAVEOPT and XRSTOR optimizations do not work across VMENTER
and VMEXIT, so with the eager code we end up always loading and
saving 384 bytes of state at every context switch, even for tasks
that never once touched the FPU.

That is 6 cache lines worth of unused stuff for each task
involved in the context switch. Somehow I am not convinced that
is a good idea...
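
The cache line count above is just the 384 bytes divided by the line
size. A tiny sketch of that arithmetic, assuming 64-byte cache lines
(the constant and the function name are illustrative, not taken from
the kernel):

```c
#include <stddef.h>

#define SIM_CACHE_LINE_BYTES 64	/* assumption: 64-byte cache lines */

/* Number of cache lines needed to cover `bytes`, rounding up. */
static size_t cache_lines_for(size_t bytes)
{
	return (bytes + SIM_CACHE_LINE_BYTES - 1) / SIM_CACHE_LINE_BYTES;
}
```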

--
All rights reversed
