Re: [PATCH v5 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace.
From: Kyle Huey
Date: Fri Aug 26 2022 - 00:48:07 EST
On Thu, Aug 18, 2022 at 2:19 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Aug 18, 2022, Kyle Huey wrote:
> > On Thu, Aug 18, 2022 at 3:57 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > > On Mon, Aug 08 2022 at 07:15, Kyle Huey wrote:
> > > > When management of the PKRU register was moved away from XSTATE, emulation
> > > > of PKRU's existence in XSTATE was added for APIs that read XSTATE, but not
> > > > for APIs that write XSTATE. This can be seen by running gdb and executing
> > > > `p $pkru`, `set $pkru = 42`, and `p $pkru`. On affected kernels (5.14+) the
> > > > write to the PKRU register (which gdb performs through ptrace) is ignored.
> > > >
> > > > There are three relevant APIs: PTRACE_SETREGSET with NT_X86_XSTATE,
> > > > sigreturn, and KVM_SET_XSAVE. KVM_SET_XSAVE has its own special handling to
> > > > make PKRU writes take effect (in fpu_copy_uabi_to_guest_fpstate). Push that
> > > > down into copy_uabi_to_xstate and have PTRACE_SETREGSET with NT_X86_XSTATE
> > > > and sigreturn pass in pointers to the appropriate PKRU value.
> > > >
> > > > This also adds code to initialize the PKRU value to the hardware init value
> > > > (namely 0) if the PKRU bit is not set in the XSTATE header to match XRSTOR.
> > > > This is a change to the current KVM_SET_XSAVE behavior.
> > >
> > > You are stating a fact here, but provide 0 justification why this is
> > > correct.
> >
> > Well, the justification is that this *is* the behavior we want for
> > ptrace/sigreturn, and it's very likely the existing KVM_SET_XSAVE
> > behavior in this edge case is an oversight rather than intentional,
> > and in the absence of confirmation that KVM wants the existing
> > behavior (the KVM mailing list and maintainer are CCd) one correct
> > code path is better than one correct code path and one buggy code
> > path.
>
> Sorry, I missed the KVM-relevant flags.
>
> Hrm, the current behavior has been KVM ABI for a very long time.
>
> It's definitely odd because all other components will be initialized due to their
> bits being cleared in the header during kvm_load_guest_fpu(), and it probably
> wouldn't cause problems in practice as most VMMs likely do "all or nothing" loads.
> But, in theory, userspace could save/restore a subset of guest XSTATE and rely on
> the kernel not overwriting guest PKRU when its bit is cleared in the header.

This seems extremely conservative, but OK. As you note, PKRU is the
only XSTATE component you could theoretically do this subset
save/restore with in the KVM ABI since all the others really do have
their hardware behavior.

> All that said, I don't see any reason to force KVM to change at this time, it's
> trivial enough to handle KVM's oddities while providing sane behavior for others.
> Nullify the pointer in the guest path and then update copy_uabi_to_xstate() to
> play nice with a NULL pointer, e.g.
>
> 	/*
> 	 * Nullify @vpkru to preserve its current value if PKRU's bit isn't set
> 	 * in the header. KVM's odd ABI is to leave PKRU untouched in this
> 	 * case (all other components are eventually re-initialized).
> 	 */
> 	if (!(kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU))
> 		vpkru = NULL;

You meant ustate->... here (since this is before the copy now), but
yes, ok, I will do that.

> 	return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);
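
For reference, a rough userspace model of the semantics discussed above.
This is not kernel code: struct mock_ustate and the mock_* functions are
made up for illustration, and the real header lives inside the user XSAVE
buffer rather than in a two-field struct. Only the PKRU bit position
(bit 9 of the XSTATE header) is taken from the real layout. The common
path writes the init value 0 when the bit is clear, matching XRSTOR, and
the KVM path checks the *user* header and nullifies the pointer first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PKRU is XSTATE component 9. */
#define XFEATURE_MASK_PKRU (1ULL << 9)

/* Hypothetical stand-in for the user-supplied XSAVE buffer. */
struct mock_ustate {
	uint64_t xfeatures;	/* header.xfeatures as given by userspace */
	uint32_t pkru;		/* PKRU payload, meaningful only if the bit is set */
};

/*
 * Model of the common path (ptrace/sigreturn): a non-NULL @pkru is always
 * written, with the init value 0 when the header bit is clear. A NULL
 * @pkru means "leave PKRU alone".
 */
static void mock_copy_uabi_to_xstate(const struct mock_ustate *ustate,
				     uint32_t *pkru)
{
	assert(ustate != NULL);

	if (!pkru)
		return;

	if (ustate->xfeatures & XFEATURE_MASK_PKRU)
		*pkru = ustate->pkru;
	else
		*pkru = 0;
}

/*
 * Model of the KVM path: look at the user header (not the kernel state,
 * which has not been copied yet) and drop the pointer if PKRU's bit is
 * clear, so the existing guest PKRU is preserved.
 */
static void mock_kvm_set_xsave(const struct mock_ustate *ustate,
			       uint32_t *vpkru)
{
	if (!(ustate->xfeatures & XFEATURE_MASK_PKRU))
		vpkru = NULL;

	mock_copy_uabi_to_xstate(ustate, vpkru);
}
```

With the bit clear, mock_kvm_set_xsave() leaves the caller's PKRU value
untouched while a direct mock_copy_uabi_to_xstate() call resets it to 0.
With the bit set, both paths store the value from the buffer.
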
- Kyle