Re: [PATCH 1/4] x86/fpu: Clear XSTATE_BV[i] in save state whenever XFD[i]=1

From: Dave Hansen

Date: Thu Jan 15 2026 - 13:19:08 EST


On 1/15/26 08:22, Paolo Bonzini wrote:
>
>   Guest running with MSR_IA32_XFD = 0
>     WRMSR(MSR_IA32_XFD)
>     vmexit
>   Host:
>     enable IRQ
>     interrupt handler
>       kernel_fpu_begin() -> sets TIF_NEED_FPU_LOAD
>         XSAVE -> stores XINUSE[18] = 1
>         ...
>       kernel_fpu_end()
>     handle vmexit
>       fpu_update_guest_xfd() -> XFD[18] = 1
>     reenter guest
>       fpu_swap_kvm_fpstate()
>         XRSTOR -> XINUSE[18] = 1 && XFD[18] = 1 -> #NM and boom
>
> With the patch, fpu_update_guest_xfd() sees TIF_NEED_FPU_LOAD set and
> clears the bit from xinuse.

Paolo, thanks for clarifying that!

Abbreviated, that's just:

XFD[18]=0
...
# Interrupt (that does XSAVE)
XFD[18]=1
XRSTOR => #NM
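
For anyone who wants to poke at that ordering, here is a minimal toy model of it. It is only a sketch: ToyCpu, DeviceNotAvailable and run_early_interrupt() are made-up names, nothing here is kernel code, and the model assumes RFBM is all ones and tracks only the bitmaps, not the register contents.

```python
AMX_BIT = 1 << 18  # state component 18, the bit used in this thread


class DeviceNotAvailable(Exception):
    """Stands in for #NM in this toy model."""


class ToyCpu:
    def __init__(self):
        self.xfd = 0     # toy MSR_IA32_XFD
        self.xinuse = 0  # toy XINUSE bitmap

    def xsave(self):
        # With XFD[i]=1, XSAVE operates as if XINUSE[i]=0.
        # The return value models the XSTATE_BV written to the buffer.
        return self.xinuse & ~self.xfd

    def xrstor(self, xstate_bv):
        # Restoring a component that the buffer marks as in use while
        # XFD disables it raises #NM.
        if xstate_bv & self.xfd:
            raise DeviceNotAvailable(hex(xstate_bv & self.xfd))
        self.xinuse = xstate_bv


def run_early_interrupt():
    cpu = ToyCpu()
    cpu.xinuse = AMX_BIT     # guest has live state in component 18
    saved_bv = cpu.xsave()   # interrupt: XSAVE runs while XFD[18]=0
    cpu.xfd = AMX_BIT        # XFD[18]=1 is applied after the save
    try:
        cpu.xrstor(saved_bv)
    except DeviceNotAvailable:
        return "#NM"
    return "ok"
```

run_early_interrupt() ends in the #NM branch because the buffer was written before XFD[18] was set.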

Is there anything preventing the kernel_fpu_begin() interrupt from
happening a little later, say:

XFD[18]=0
...
XFD[18]=1
# Interrupt (that does XSAVE)
XRSTOR (no #NM)

In that case, the XSAVE in kernel_fpu_begin() "operates as if XINUSE[i]
= 0" and would set XFEATURES[18]=0 (XSTATE_BV[18] in SDM terms); it would
save the component as being in its init state. The later XRSTOR would
then restore state 18 to its init state.
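
The later-interrupt ordering can be sketched the same way. Again this is a toy under stated assumptions: toy_xsave_bv(), toy_xrstor() and run_late_interrupt() are hypothetical helpers that operate on plain integers, with RFBM assumed to be all ones.

```python
BIT18 = 1 << 18  # state component 18


def toy_xsave_bv(xinuse, xfd):
    # With XFD[i]=1, XSAVE operates as if XINUSE[i]=0, so the saved
    # XSTATE_BV has that bit clear (component recorded as init).
    return xinuse & ~xfd


def toy_xrstor(xstate_bv, xfd):
    # #NM only if the buffer marks a component in use that XFD disables.
    if xstate_bv & xfd:
        raise RuntimeError("#NM")
    # Components with XSTATE_BV[i]=0 are restored to init, so the new
    # XINUSE in this model is simply the buffer's XSTATE_BV.
    return xstate_bv


def run_late_interrupt():
    xinuse = BIT18                      # live state in component 18
    xfd = BIT18                         # XFD[18]=1 comes first this time
    saved = toy_xsave_bv(xinuse, xfd)   # interrupt: XSAVE
    xinuse = toy_xrstor(saved, xfd)     # no #NM
    return saved, xinuse
```

Both values returned by run_late_interrupt() are zero: no fault, and component 18 comes back in its init state.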

Without involving SMIs, I think it lands feature 18 in its init state as
well. The state is _already_ being destroyed in the existing code
without anything exotic needing to happen.

That's a long-winded way of saying I think I agree with the patch. It
destroys the state a bit more aggressively but it doesn't do anything _new_.
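
In the same toy terms, my reading of what the patch does is roughly the following. This is a sketch, not the actual fpu_update_guest_xfd() code: toy_update_guest_xfd() is a hypothetical stand-in that only models "clear XSTATE_BV[i] wherever XFD[i]=1" on plain integers.

```python
BIT18 = 1 << 18  # state component 18


def toy_xrstor(xstate_bv, xfd):
    """Return the new XINUSE, or raise to stand in for #NM."""
    if xstate_bv & xfd:
        raise RuntimeError("#NM")
    return xstate_bv


def toy_update_guest_xfd(saved_bv, new_xfd):
    """Hypothetical stand-in for the patched update path: drop every
    XSTATE_BV bit whose XFD bit is now set."""
    return saved_bv & ~new_xfd


def run_patched():
    saved_bv = BIT18   # interrupt XSAVE ran while XFD[18]=0
    xfd = BIT18        # the guest's XFD value is applied afterwards
    saved_bv = toy_update_guest_xfd(saved_bv, xfd)
    return toy_xrstor(saved_bv, xfd)  # no #NM; component 18 ends up init
```

The end result matches the later-interrupt ordering: component 18 ends up in its init state and XRSTOR does not fault.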

What would folks think about making the SDM language stronger, or at
least explicitly adding language saying that setting XFD[i]=1 can lead
to XINUSE[i] going from 1=>0? Kinda like the language that's already in
"XRSTOR and the Init and Modified Optimizations", but specific to XFD:

If XFD[i] = 1 and XINUSE[i] = 1, state component i may be
tracked as init; XINUSE[i] may be set to 0.

That would make it consistent with the KVM behavior. It might also give
the CPU folks some additional wiggle room for new behavior.