[PATCH v3 10/10] x86/fpu/xstate: Restore supervisor states for signal return

From: Yu-cheng Yu
Date: Sat Mar 28 2020 - 12:44:08 EST


As described in the previous patch, the signal return fast path directly
restores user states from the user buffer. Once that succeeds, restore
supervisor states (but only when they are not yet restored).

For the slow path, save supervisor states to preserve them across context
switches, and restore after the user states are restored.

The previous version has the overhead of an XSAVES in both the fast and the
slow paths. It is addressed as the following:

- In the fast path, only do an XRSTORS.
- In the slow path, do a supervisor-state-only XSAVES, and relocate the
buffer contents.

Some thoughts in the implementation:

- In the slow path, can any supervisor state become stale between
save/restore?

Answer: set_thread_flag(TIF_NEED_FPU_LOAD) protects the xstate buffer.

- In the slow path, can any code reference a stale supervisor state
register between save/restore?

Answer: In the current lazy-restore scheme, any reference to xstate
registers needs fpregs_lock()/fpregs_unlock() and __fpregs_load_activate().

- Are there other options?

One other option is eagerly restoring all supervisor states.

Currently, CET user-mode states and ENQCMD's PASID do not need to be
eagerly restored. The upcoming CET kernel-mode states (24 bytes) need
to be eagerly restored. To me, eagerly restoring all supervisor states
adds more overhead then benefit at this point.

v3:
- Change copy_xregs_to_kernel() to copy_supervisor_to_kernel(), which is
introduced in a previous patch.
- Update commit log.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
Reviewed-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
---
arch/x86/kernel/fpu/signal.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 545ca4314096..4dad5afc938d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -347,6 +347,11 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
ret = copy_user_to_fpregs_zeroing(buf_fx, user_xfeatures, fx_only);
pagefault_enable();
if (!ret) {
+ /* Restore supervisor states */
+ if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
+ xfeatures_mask_supervisor())
+ copy_kernel_to_xregs(&fpu->state.xsave,
+ xfeatures_mask_supervisor());
fpregs_mark_activate();
fpregs_unlock();
return 0;
@@ -364,14 +369,21 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
}

/*
- * The current state of the FPU registers does not matter. By setting
- * TIF_NEED_FPU_LOAD unconditionally it is ensured that the our xstate
- * is not modified on context switch and that the xstate is considered
+ * Supervisor states are not modified by user space input. Save
+ * current supervisor states first.
+ * By setting TIF_NEED_FPU_LOAD it is ensured that our xstate is
+ * not modified on context switch and that the xstate is considered
* to be loaded again on return to userland (overriding last_cpu avoids
* the optimisation).
*/
- set_thread_flag(TIF_NEED_FPU_LOAD);
+ fpregs_lock();
+ if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+ if (xfeatures_mask_supervisor())
+ copy_supervisor_to_kernel(&fpu->state.xsave);
+ set_thread_flag(TIF_NEED_FPU_LOAD);
+ }
__fpu_invalidate_fpregs_state(fpu);
+ fpregs_unlock();

if (use_xsave() && !fx_only) {
u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
@@ -393,7 +405,12 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
fpregs_lock();
if (unlikely(init_bv))
copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
- ret = copy_kernel_to_xregs_err(&fpu->state.xsave, user_xfeatures);
+ /*
+ * Restore previously saved supervisor xstates along with
+ * copied-in user xstates.
+ */
+ ret = copy_kernel_to_xregs_err(&fpu->state.xsave,
+ user_xfeatures | xfeatures_mask_supervisor());

} else if (use_fxsr()) {
ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
--
2.21.0