Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE

From: Jeff Xu
Date: Thu Apr 25 2024 - 20:12:55 EST


On Thu, Apr 25, 2024 at 3:49 PM Aruna Ramakrishna
<aruna.ramakrishna@xxxxxxxxxx> wrote:
>
>
>
> > On Apr 25, 2024, at 2:05 PM, jeffxu@xxxxxxxxxxxx wrote:
> >
> > From: Jeff Xu <jeffxu@xxxxxxxxxxxx>
> >
> > On 3/21/24 14:56, Aruna Ramakrishna wrote:
> >> Enabling both the non-zero pkey (for the thread) and pkey zero (in
> >> userspace) will not work for us. We cannot have the alt stack writeable
> >> by all - the rationale here is that the code running in that thread
> >> (using a non-zero pkey) is untrusted and should not have access to the
> >> alternate signal stack (that uses pkey zero), to prevent the return
> >> address of a function from being changed. The expectation is that kernel
> >> should be able to set up the alternate signal stack and deliver the
> >> signal to the application even if pkey zero is explicitly disabled by
> >> the application. The signal handler accessibility should not be dictated
> >> by the PKRU value that the thread sets up.
> >>
> > We have a similar threat model that we don't want "untrusted threads" to
> > access altstack. I think this patch need not be restricted to the
> > use case of zero pkey for altstack, i.e. application can also set
> > non-zero pkey to altstack and expect the same.
>
> Agreed. In the latest version of this patchset, this assumption has been removed.
>
> Link here:
> https://lore.kernel.org/lkml/20240425180542.1042933-1-aruna.ramakrishna@xxxxxxxxxx/T/#t
>
> >
> >> Solution:
> >> The PKRU register is managed by XSAVE, which means the sigframe contents
> >> must match the register contents - which is not the case here. We want
> >> the sigframe to contain the user-defined PKRU value (so that it is
> >> restored correctly from sigcontext) but the actual register must be
> >> reset to init_pkru so that the alt stack is accessible and the signal
> >> can be delivered to the application. It seems that the proper fix here
> >> would be to remove PKRU from the XSAVE framework and manage it
> >> separately, which is quite complicated. As a workaround, this patch does
> >> something like this:
> >>
> >> orig_pkru = rdpkru();
> >> wrpkru(init_pkru & orig_pkru);
> >> xsave_to_user_sigframe();
> >> put_user(pkru_sigframe_addr, orig_pkru)
> >>
> > The default PKRU of thread [1] is set as 01 (disable access) for each PKEY
> > from 1 to 15, and 00 (RW) for PKEY 0.
> >
> > Let's use pkey 1 as an example:
> > The init_pkru is 01, if the thread has PKRU (orig_pkru) as 10 (disable write
> > but have read) then new_pkru from (init_pkru & orig_pkru) is 00, which gives
> > RW access to the pkey 1.
> >
> > When the thread has orig_pkru as 01 (disable access) or 00 (RW), new_pkru is
> > unchanged from orig_pkru.
> >
> > Now take pkey 0:
> > the init_pkru is 00, regardless what threads has, new_pkru will always be 00.
> >
> > This seems to work out well for pkey 1 to 15, i.e. signal handing code in
> > kernel only give write access when the thread alrady has read access to the
> > PKEY that is used by the altstack. The threat model interesting here is to
> > prevent untrusted threads from writing to altstack, and read is probably less
> > of a problem.
> >
>
> This piece of code assumed that the init PKRU value allows writes to the alternative
> signal stack. As you mentioned earlier, that may not always be true - a non-zero pkey
> can be used for the altstack.
>
Only PKEY 0 has init PKRU as 00.
So in your case, the thread doesn't have write access to pkey 0, and
need the write
access to pkey 0 during signal handling.

> So the new version simply does write_pkru(0) (i.e. enabled all pkeys) before XSAVE.
> Is this more reasonable?
>
write_pkru(0) will work, but it is unnecessary in the current patch.

Consider the following case: A thread has no access to pkey 1, and
use pkey 1 for its altstack.

In V3 (use write_pkru(0):
Signal will be dispatched to the user, i.e. write to signal frame is
OK, but it will SEGV at sigreturn.

In V2:
it will SEGV earlier at dispatch stage when writing to sigframe.

I would rather that the code fails earlier for this case.

> >
> > Does this meet what you want? (Note the pkey 0 is different than 1-15)
> >
> > Suppose someone also like to disable all access to altstack, then there is one
> > more place to mind: in sigreturn(), it calls restore_altstack(), and requires
> > read access to altstack. However, at the time, PKRU is already restored from
> > sigframe, so SEGV will raise (the value in sigframe doesn't have read access
> > to the PKEY).
> >
> > Without changing sigreturn, using wrpkru(0) here might not be necessary:
> > the dispatch to user space works fine, only to crash at sigreturn step.
> >
> > [1] defined by init_pkru_value in pkeys.c
> >
> > Best regards,
> > -Jeff
>
>
> I see what you're saying. In rt_sigreturn():
>
> if (!restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags)) <— restores PKRU, disabling access to altstack
> goto badframe;
> ...
> if (restore_altstack(&frame->uc.uc_stack)) <— needs read access to altstack
> goto badframe;
>
>
> I’m wary about reordering anything in here. Also, this code is not aware of the altstack permissions. I’m wondering if wrpkru(0) is needed here too.
>
We can't change PKRU after restore_sigcontext, the calling thread
would have PKRU 0, not the original PKRU from before handling the
signal.