Re: [PATCH V8 15/44] x86/pkeys: Preserve the PKS MSR on context switch

From: Dave Hansen
Date: Fri Jan 28 2022 - 19:22:47 EST


On 1/27/22 09:54, ira.weiny@xxxxxxxxx wrote:
> From: Ira Weiny <ira.weiny@xxxxxxxxx>
>
> The PKS MSR (PKRS) is defined as a per-logical-processor register. This

s/defined as//

> isolates memory access by logical CPU.

This second sentence is a bit confusing to me. I *think* you're trying
to say that PKRS only affects accesses from one logical CPU. But, it
just comes out strangely. I think I'd just axe the sentence.

> Unfortunately, the MSR is not
> managed by XSAVE. Therefore, tasks must save/restore the MSR value on
> context switch.
>
> Define pks_saved_pkrs in struct thread_struct. Initialize all tasks,
> including the init_task, with the PKS_INIT_VALUE when created. Restore
> the CPU's MSR to the saved task value on schedule in.
>
> pks_write_current() is added to ensures non-supervisor pkey

^ ensure

...
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 2c5f12ae7d04..3530a0e50b4f 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -2,6 +2,8 @@
> #ifndef _ASM_X86_PROCESSOR_H
> #define _ASM_X86_PROCESSOR_H
>
> +#include <linux/pks-keys.h>
> +
> #include <asm/processor-flags.h>
>
> /* Forward declaration, a strange C thing */
> @@ -502,6 +504,12 @@ struct thread_struct {
> unsigned long cr2;
> unsigned long trap_nr;
> unsigned long error_code;
> +
> +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
> + /* Saved Protection key register for supervisor mappings */
> + u32 pks_saved_pkrs;
> +#endif

There are a bunch of other "saved" registers in thread_struct. They all
just have their register name, including pkru.

Can you just stick this next to 'pkru' and call it plain old 'pkrs'?
That will probably even take up less space than this since the two
32-bit values can be packed together.

> #ifdef CONFIG_VM86
> /* Virtual 86 mode info */
> struct vm86 *vm86;
> @@ -769,7 +777,14 @@ static inline void spin_lock_prefetch(const void *x)
> #define KSTK_ESP(task) (task_pt_regs(task)->sp)
>
> #else
> -#define INIT_THREAD { }
> +
> +#ifdef CONFIG_ARCH_ENABLE_SUPERVISOR_PKEYS
> +#define INIT_THREAD { \
> + .pks_saved_pkrs = PKS_INIT_VALUE, \
> +}
> +#else
> +#define INIT_THREAD { }
> +#endif
>
> extern unsigned long KSTK_ESP(struct task_struct *task);
>
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index 3402edec236c..81fc0b638308 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -59,6 +59,7 @@
> /* Not included via unistd.h */
> #include <asm/unistd_32_ia32.h>
> #endif
> +#include <asm/pks.h>
>
> #include "process.h"
>
> @@ -657,6 +658,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
> /* Load the Intel cache allocation PQR MSR. */
> resctrl_sched_in();
>
> + pks_write_current();
> +
> return prev_p;
> }

At least for pkru and fsgsbase, these have the form:

x86_<register>_load();

Should this be

x86_pkrs_load();

and be located next to:

x86_pkru_load()?

> diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
> index 3dce99ef4127..6d94dfc9a219 100644
> --- a/arch/x86/mm/pkeys.c
> +++ b/arch/x86/mm/pkeys.c
> @@ -242,6 +242,19 @@ static inline void pks_write_pkrs(u32 new_pkrs)
> }
> }
>
> +/**
> + * pks_write_current() - Write the current thread's saved PKRS value
> + *
> + * Context: must be called with preemption disabled
> + */
> +void pks_write_current(void)
> +{
> + if (!cpu_feature_enabled(X86_FEATURE_PKS))
> + return;
> +
> + pks_write_pkrs(current->thread.pks_saved_pkrs);
> +}
> +
> /*
> * PKS is independent of PKU and either or both may be supported on a CPU.
> *