Re: XSAVE / RDPKRU on Intel 11th Gen Core CPUs
From: Brian Geffon
Date: Tue Nov 09 2021 - 14:26:40 EST
On Tue, Nov 9, 2021 at 1:58 PM Brian Geffon <bgeffon@xxxxxxxxxx> wrote:
>
> Hi Andy,
>
> On Tue, Nov 9, 2021 at 9:58 AM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > Here's an excerpt from an old email that I, perhaps unwisely, sent to Dave but not to a public list:
> >
> > static inline void write_pkru(u32 pkru)
> > {
> > struct pkru_state *pk;
> >
> > if (!boot_cpu_has(X86_FEATURE_OSPKE))
> > return;
> >
> > pk = get_xsave_addr(¤t->thread.fpu.state.xsave,
> > XFEATURE_PKRU);
> >
> > /*
> > * The PKRU value in xstate needs to be in sync with the value
> > that is
> > * written to the CPU. The FPU restore on return to userland would
> > * otherwise load the previous value again.
> > */
> > fpregs_lock();
> > if (pk)
> > pk->pkru = pkru;
> >
> > ^^^
> > else we just write to the PKRU register but leave XINUSE[PKRU] clear on
> > return to usermode? That seems... unwise.
> >
> > __write_pkru(pkru);
> > fpregs_unlock();
> > }
> >
> > I bet you're hitting exactly this bug. The fix ended up being a whole series of patches, but the gist of it is that the write_pkru() slow path needs to set the xfeature bit in the xsave buffer and then do the write. It should be possible to make a little patch to do just this in a couple lines of code.
>
> I think you've got the right idea, the following patch does seem to
> fix the problem on this CPU, this is based on 5.13. It seems the
> changes to asm/pgtable.h were not enough, I also had to modify
> fpu/internal.h to get it working properly.
>
Actually, it seems that only the changes to fpu/internal.h seem
necessary. I guess the switch_fpu_finish explains how it's reverting
to the initial value.
diff --git a/arch/x86/include/asm/fpu/internal.h
b/arch/x86/include/asm/fpu/internal.h
index 16bf4d4a8159..ed2ce7d1afeb 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -564,18 +564,16 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
* PKRU state is switched eagerly because it needs to be valid before we
* return to userland e.g. for a copy_to_user() operation.
*/
- if (!(current->flags & PF_KTHREAD)) {
- /*
- * If the PKRU bit in xsave.header.xfeatures is not set,
- * then the PKRU component was in init state, which means
- * XRSTOR will set PKRU to 0. If the bit is not set then
- * get_xsave_addr() will return NULL because the PKRU value
- * in memory is not valid. This means pkru_val has to be
- * set to 0 and not to init_pkru_value.
- */
- pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
- pkru_val = pk ? pk->pkru : 0;
- }
+ /*
+ * If the PKRU bit in xsave.header.xfeatures is not set,
+ * then the PKRU component was in init state, which means
+ * XRSTOR will set PKRU to 0. If the bit is not set then
+ * get_xsave_addr() will return NULL because the PKRU value
+ * in memory is not valid. This means pkru_val has to be
+ * set to 0 and not to init_pkru_value.
+ */
+ pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
+ pkru_val = pk ? pk->pkru : 0;
__write_pkru(pkru_val);
}