[PATCH v8 3/5] x86/pkeys: Update PKRU to enable all pkeys before XSAVE

From: Dmitry Vyukov
Date: Tue Feb 04 2025 - 05:02:19 EST


Re commit 70044df250d022572e26cd301bddf75eac1fe50e:
https://lore.kernel.org/all/20240802061318.2140081-4-aruna.ramakrishna@xxxxxxxxxx/

> If the alternate signal stack is protected by a different pkey than the
> current execution stack, copying xsave data to the sigaltstack will fail
> if its pkey is not enabled in the PKRU register.
>
> We do not know which pkey was used by the application for the altstack,
> so enable all pkeys before xsave.
>
> But this updated PKRU value is also pushed onto the sigframe, which
> means the register value restored from sigcontext will be different from
> the user-defined one, which is unexpected. Fix that by overwriting the
> PKRU value on the sigframe with the original, user-defined PKRU.

Hi,

This unfortunately seems to be broken for rseq user-space writes.
If the signal is caused by the rseq struct being inaccessible due to PKEYs,
we try to write to rseq again at setup_rt_frame->rseq_signal_deliver,
which happens _before_ sig_prepare_pkru and won't succeed
(the PKEY is still inaccessible, so the process is hard-killed).
Any PKEY sandbox would want to restrict untrusted access to rseq
as well (otherwise it allows easy sandbox escapes).

If we do sig_prepare_pkru before rseq_signal_deliver (and generally
before any copy_to_user), then the user-space handler gets SIGSEGV
and could unregister rseq and retry.
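
To make the ordering concrete, here is a toy user-space model of the
two orderings. It is not kernel code: the model_* functions and the
single model_pkru variable are made up for illustration and only mimic
the call order described above. The one hardware fact it relies on is
that PKRU holds two bits per key (access-disable and write-disable),
and it assumes "enable all pkeys" means writing 0 to PKRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; stands in for the per-thread PKRU register. */
static uint32_t model_pkru;

/* PKRU has two bits per key: access-disable (even), write-disable (odd). */
static bool pkey_writable(uint32_t pkru, int pkey)
{
    assert(pkey >= 0 && pkey < 16);
    return ((pkru >> (2 * pkey)) & 0x3u) == 0;
}

/* Mimics sig_prepare_pkru(): enable all pkeys, return the original value. */
static uint32_t model_sig_prepare_pkru(void)
{
    uint32_t orig = model_pkru;

    model_pkru = 0;
    return orig;
}

/* Mimics the rseq write done from rseq_signal_deliver(). */
static bool model_rseq_signal_deliver(int rseq_pkey)
{
    return pkey_writable(model_pkru, rseq_pkey);
}

/*
 * Returns true if the rseq write succeeds (the handler can run),
 * false if it faults (in the real kernel the process is killed).
 */
static bool model_setup_rt_frame(uint32_t user_pkru, int rseq_pkey,
                                 bool prepare_first)
{
    bool ok;

    model_pkru = user_pkru;
    if (prepare_first) {
        /* Proposed order: open up PKRU, then touch rseq. */
        model_sig_prepare_pkru();
        ok = model_rseq_signal_deliver(rseq_pkey);
    } else {
        /* Current order: touch rseq while its pkey is still blocked. */
        ok = model_rseq_signal_deliver(rseq_pkey);
        model_sig_prepare_pkru();
    }
    return ok;
}
```

With rseq under pkey 1 and that key disabled in user_pkru, the model
fails with prepare_first == false and succeeds with
prepare_first == true, which is the difference I am describing.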

However, I am not sure if it's the best solution performance-
and complexity-wise (for user-space). A better solution may be to
change __rseq_handle_notify_resume to temporarily switch to the default
PKEY if user accesses fail.
Rseq is similar to signals in this respect. Since rseq updates
happen asynchronously with respect to user-space control flow,
if a program uses rseq and ever makes rseq inaccessible with PKEYs,
it's in trouble and will be randomly killed.
Since rseq updates are asynchronous, like signals, they shouldn't
assume that the PKEY is set to a default value that allows access
to the rseq descriptor.
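
A rough sketch of the "retry with a temporarily switched PKRU" idea,
again as a toy user-space model rather than a patch. The function
model_rseq_update, the result enum and the fallback_pkru parameter are
all hypothetical; in particular, which PKRU value the kernel should
switch to is left open here and passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum rseq_result { RSEQ_OK_DIRECT, RSEQ_OK_FALLBACK, RSEQ_FAILED };

/* PKRU has two bits per key: access-disable (even), write-disable (odd). */
static bool pkru_allows_write(uint32_t pkru, int pkey)
{
    assert(pkey >= 0 && pkey < 16);
    return ((pkru >> (2 * pkey)) & 0x3u) == 0;
}

/*
 * Try the rseq write with the current PKRU. Only if that fails, switch
 * to fallback_pkru, retry, and restore the original value, so the fast
 * path pays nothing extra.
 */
static enum rseq_result model_rseq_update(uint32_t *pkru, int rseq_pkey,
                                          uint32_t fallback_pkru)
{
    uint32_t saved;
    bool ok;

    if (pkru_allows_write(*pkru, rseq_pkey))
        return RSEQ_OK_DIRECT;

    saved = *pkru;
    *pkru = fallback_pkru;      /* temporary switch */
    ok = pkru_allows_write(*pkru, rseq_pkey);
    *pkru = saved;              /* always restore the user's PKRU */

    return ok ? RSEQ_OK_FALLBACK : RSEQ_FAILED;
}
```

The point of the shape is that the user-visible PKRU is unchanged
after the call, and the update can only fail if even the fallback
value does not permit access to the rseq area.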

Thoughts?