Re: [Patch v5 06/19] perf/x86: Add support for XMM registers in non-PEBS and REGS_USER

From: Dave Hansen

Date: Thu Dec 04 2025 - 13:59:17 EST


On 12/4/25 07:17, Peter Zijlstra wrote:
>> - Additionally, checking the TIF_NEED_FPU_LOAD flag alone is insufficient.
>> Some corner cases, such as an NMI occurring just after the flag switches
>> but still in kernel mode, cannot be handled.
> Urgh.. Dave, Thomas, is there any reason we could not set
> TIF_NEED_FPU_LOAD *after* doing the XSAVE (clearing is already done
> after restore).
>
> That way, when an NMI sees TIF_NEED_FPU_LOAD it knows the task copy is
> consistent.

Something like the attached patch?

I think that would be just fine. save_fpregs_to_fpstate() doesn't
actually change the need for TIF_NEED_FPU_LOAD, so I don't think the
ordering matters.

diff --git a/arch/x86/include/asm/fpu/sched.h b/arch/x86/include/asm/fpu/sched.h
index 89004f4ca208..2d57a7bf5406 100644
--- a/arch/x86/include/asm/fpu/sched.h
+++ b/arch/x86/include/asm/fpu/sched.h
@@ -36,8 +36,8 @@ static inline void switch_fpu(struct task_struct *old, int cpu)
 	    !(old->flags & (PF_KTHREAD | PF_USER_WORKER))) {
 		struct fpu *old_fpu = x86_task_fpu(old);

-		set_tsk_thread_flag(old, TIF_NEED_FPU_LOAD);
 		save_fpregs_to_fpstate(old_fpu);
+		set_tsk_thread_flag(old, TIF_NEED_FPU_LOAD);
 		/*
 		 * The save operation preserved register state, so the
 		 * fpu_fpregs_owner_ctx is still @old_fpu. Store the
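
For illustration only, here is a rough userspace model of the window Peter describes. It is not kernel code: struct model_task, live_xmm0, model_nmi_sample() and model_sample_mid_switch() are all made-up names, and a single integer stands in for the whole XSAVE area. The sketch assumes an NMI sampler that trusts the flag alone: it reads the task's saved copy when the flag is set and the live registers otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the task state involved. */
struct model_task {
	bool need_fpu_load;	/* models TIF_NEED_FPU_LOAD */
	uint64_t saved_xmm0;	/* models the task's in-memory fpstate copy */
};

static uint64_t live_xmm0;	/* models the live hardware register */

/* What an NMI sampler would read if it trusted the flag alone. */
static uint64_t model_nmi_sample(const struct model_task *t)
{
	return t->need_fpu_load ? t->saved_xmm0 : live_xmm0;
}

/*
 * Run the two steps of the switch path in either order and take one
 * simulated NMI sample in the window between them.
 */
static uint64_t model_sample_mid_switch(bool flag_first)
{
	struct model_task t = { .need_fpu_load = false, .saved_xmm0 = 0 };
	uint64_t sample;

	live_xmm0 = 0x1234;	/* current user register contents */

	if (flag_first) {
		t.need_fpu_load = true;		/* old order: flag, then save */
		sample = model_nmi_sample(&t);	/* flag set, copy still stale */
		t.saved_xmm0 = live_xmm0;
	} else {
		t.saved_xmm0 = live_xmm0;	/* patched order: save, then flag */
		sample = model_nmi_sample(&t);	/* flag clear, reads live regs */
		t.need_fpu_load = true;
	}

	/* Either order ends in the same state; only the window differs. */
	assert(t.need_fpu_load && t.saved_xmm0 == live_xmm0);
	return sample;
}
```

In this model the final state is identical for both orders, which matches the observation that the switch path itself does not care. Only the sample taken inside the window differs: with the flag set first the sampler returns the stale copy, with the save done first it returns the live value.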