Re: [PATCH RFC 3/3] x86/fpu: defer FPU state load until return to userspace

From: Andy Lutomirski
Date: Mon Oct 17 2016 - 16:58:39 EST


On Mon, Oct 17, 2016 at 1:09 PM, <riel@xxxxxxxxxx> wrote:
> From: Rik van Riel <riel@xxxxxxxxxx>
>
> Defer loading of FPU state until return to userspace. This gives
> the kernel the potential to skip loading FPU state for tasks that
> stay in kernel mode, or for tasks that end up with repeated
> invocations of kernel_fpu_begin.
>
> It also increases the chances that a task's FPU state will remain
> valid in the FPU registers until it is scheduled back in, allowing
> us to skip restoring that task's FPU state altogether.
>
> This also prepares the groundwork for not having to restore
> qemu userspace FPU state in KVM VCPU threads, when merely returning
> to the host kernel because the guest went idle, or is running a
> kernel thread. That functionality will come in a later patch.
>
> Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
> ---
> arch/x86/entry/common.c | 9 +++++++++
> arch/x86/include/asm/fpu/api.h | 5 +++++
> arch/x86/include/asm/fpu/internal.h | 13 +++++--------
> arch/x86/include/asm/thread_info.h | 4 +++-
> arch/x86/kernel/fpu/core.c | 28 ++++++++++++++++++++++++----
> arch/x86/kernel/process_32.c | 5 ++---
> arch/x86/kernel/process_64.c | 5 ++---
> 7 files changed, 50 insertions(+), 19 deletions(-)
>
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index bdd9cc59d20f..0c11ee22f90b 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -27,6 +27,7 @@
> #include <asm/vdso.h>
> #include <asm/uaccess.h>
> #include <asm/cpufeature.h>
> +#include <asm/fpu/api.h>
>
> #define CREATE_TRACE_POINTS
> #include <trace/events/syscalls.h>
> @@ -189,6 +190,14 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
>  	if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
>  		exit_to_usermode_loop(regs, cached_flags);
>
> +	/* Reload ti->flags; we may have rescheduled above. */
> +	cached_flags = READ_ONCE(ti->flags);

Please move this reload into the "if" above; the flags can only have
changed if we actually went through exit_to_usermode_loop().
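
Something along these lines, sketched as a user-space model rather than
real kernel code: the struct, the flag values, and the body of
exit_to_usermode_loop() are stand-ins made up for illustration, and only
the control flow in prepare_exit_to_usermode() is the point. Untested
against the actual tree.

```c
/* Hypothetical flag values; the real ones live in asm/thread_info.h. */
#define _TIF_NEED_RESCHED		(1u << 0)
#define _TIF_LOAD_FPU			(1u << 1)
#define EXIT_TO_USERMODE_LOOP_FLAGS	(_TIF_NEED_RESCHED)

/* Minimal stand-in for the kernel's struct thread_info. */
struct thread_info {
	unsigned int flags;
};

/* Counts how often the FPU state would have been loaded. */
static int fpu_loads;

/* Stub: the real function restores the task's FPU registers. */
static void switch_fpu_return(void)
{
	fpu_loads++;
}

/*
 * Stub: pretend we rescheduled and the context switch marked the
 * FPU state as needing a reload before returning to userspace.
 */
static void exit_to_usermode_loop(struct thread_info *ti,
				  unsigned int cached_flags)
{
	(void)cached_flags;
	ti->flags &= ~_TIF_NEED_RESCHED;
	ti->flags |= _TIF_LOAD_FPU;
}

static void prepare_exit_to_usermode(struct thread_info *ti)
{
	unsigned int cached_flags = ti->flags;

	if (cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS) {
		exit_to_usermode_loop(ti, cached_flags);
		/* Reload only on the path that may have rescheduled. */
		cached_flags = ti->flags;
	}

	if (cached_flags & _TIF_LOAD_FPU) {
		ti->flags &= ~_TIF_LOAD_FPU;
		switch_fpu_return();
	}
}
```

With the reload inside the "if", the common path that never enters the
loop does not pay for a second read of ti->flags.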

> +
> +	if (unlikely(cached_flags & _TIF_LOAD_FPU)) {
> +		clear_thread_flag(TIF_LOAD_FPU);
> +		switch_fpu_return();
> +	}
> +

But I still don't see how this can work correctly with PKRU.

--Andy