Re: [patch 3/8] x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer

From: Andy Lutomirski
Date: Thu Jun 03 2021 - 13:30:09 EST


On 6/2/21 8:06 AM, Borislav Petkov wrote:
> On Wed, Jun 02, 2021 at 11:55:46AM +0200, Thomas Gleixner wrote:
>> From: Andy Lutomirski <luto@xxxxxxxxxx>
>>
>> If XRSTOR fails due to sufficiently complicated paging errors (e.g.
>> concurrent TLB invalidation),
>
> I can't connect "concurrent TLB invalidation" to "sufficiently
> complicated paging errors". Can you elaborate pls?

Think "complex microarchitectural conditions".

How about:

As far as I can tell, both Intel and AMD consider it to be
architecturally valid for XRSTOR to fail with #PF but nonetheless change
user state. The actual conditions under which this might occur are
unclear [1], but it seems plausible that this might be triggered if one
sibling thread unmaps a page and invalidates the shared TLB while
another sibling thread is executing XRSTOR on the page in question.

__fpu__restore_sig() can execute XRSTOR while the hardware registers are
preserved on behalf of a different victim task (using the
fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but
modify the registers. If this happens, then there is a window in which
__fpu__restore_sig() could schedule out and the victim task could
schedule back in without reloading its own FPU registers. This would
result in part of the FPU state that __fpu__restore_sig() was attempting
to load leaking into the victim task's user-visible state.

Invalidate preserved FPU registers on XRSTOR failure to prevent this
situation from corrupting any state.

[1] Frequent readers of the errata lists might imagine "complex
microarchitectural conditions".

>> + * failed. In the event that the ucode was
>> + * unfriendly and modified the registers at all, we
>> + * need to make sure that we aren't corrupting an
>> + * innocent non-current task's registers.
>> + */
>> + __cpu_invalidate_fpregs_state();
>> + } else {
>> + /*
>> + * As above, we may have just clobbered current's
>> + * user FPU state. We will either successfully
>> + * load it or clear it below, so no action is
>> + * required here.
>> + */
>> + }
>
> I'm wondering if that comment can simply be above the TIF_NEED_FPU_LOAD
> testing, standalone, instead of having it in an empty else? And then get
> rid of that else.

I'm fine either way.