[PATCH v2] x86/fpu: Don't cache access to fpu_fpregs_owner_ctx

From: Sebastian Andrzej Siewior
Date: Thu Nov 28 2019 - 03:53:35 EST


The state/owner of FPU is saved fpu_fpregs_owner_ctx by pointing to the
context that is currently loaded. It never changed during the life time
of a task and remained stable/constant.

Since deferred loading the FPU registers on return to userland, the
content of fpu_fpregs_owner_ctx may change during preemption and must
not be cached.
This went unnoticed for some time and was now noticed, in particular
gcc-9 is able to cache that load in copy_fpstate_to_sigframe() and reuse
it in the retry loop:

copy_fpstate_to_sigframe()
load fpu_fpregs_owner_ctx and save on stack
fpregs_lock()
copy_fpregs_to_sigframe() /* failed */
fpregs_unlock()
*** PREEMPTION, another uses FPU, changes fpu_fpregs_owner_ctx ***

fault_in_pages_writeable() /* succeed, retry */

fpregs_lock()
__fpregs_load_activate()
fpregs_state_valid() /* uses fpu_fpregs_owner_ctx from stack */
copy_fpregs_to_sigframe() /* succeeds, random FPU content */

This is a comparison of the assembly of gcc-9, without vs with this
patch:

| # arch/x86/kernel/fpu/signal.c:173: if (!access_ok(buf, size))
| cmpq %rdx, %rax # tmp183, _4
| jb .L190 #,
|-# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|-#APP
|-# 512 "arch/x86/include/asm/fpu/internal.h" 1
|- movq %gs:fpu_fpregs_owner_ctx,%rax #, pfo_ret__
|-# 0 "" 2
|-#NO_APP
|- movq %rax, -88(%rbp) # pfo_ret__, %sfp
â
|-# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|- movq -88(%rbp), %rcx # %sfp, pfo_ret__
|- cmpq %rcx, -64(%rbp) # pfo_ret__, %sfp
|+# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|+#APP
|+# 512 "arch/x86/include/asm/fpu/internal.h" 1
|+ movq %gs:fpu_fpregs_owner_ctx(%rip),%rax # fpu_fpregs_owner_ctx, pfo_ret__
|+# 0 "" 2
|+# arch/x86/include/asm/fpu/internal.h:512: return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
|+#NO_APP
|+ cmpq %rax, -64(%rbp) # pfo_ret__, %sfp

Use this_cpu_read() instead this_cpu_read_stable() to avoid caching of
fpu_fpregs_owner_ctx during preemption points.

The fixes tag points to the commit where defered FPU loading was added.
Since this commit the compiler is no longer allowed move the load of
fpu_fpregs_owner_ctx somewhere else / outside of the locked section. A
task preemption will change its value and stale content will be observed.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=205663
Fixes: 5f409e20b7945 ("x86/fpu: Defer FPU state load until return to userspace")
Debugged-by: Austin Clements <austin@xxxxxxxxxx>
Debugged-by: David Chase <drchase@xxxxxxxxxx>
Debugged-by: Ian Lance Taylor <ian@xxxxxxxx>
Tested-by: Borislav Petkov <bp@xxxxxxx>
Reviewed-by: Rik van Riel <riel@xxxxxxxxxxx>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
---

v1âv2:
- Adding tags
- Explaining why Fixes: does not point to the bisected commit.

arch/x86/include/asm/fpu/internal.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 4c95c365058aa..44c48e34d7994 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -509,7 +509,7 @@ static inline void __fpu_invalidate_fpregs_state(struct fpu *fpu)

static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
{
- return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
+ return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
}

/*
--
2.24.0