Re: [RFC PATCH v3 00/15] pkeys-based page table hardening

From: Kees Cook
Date: Thu Feb 06 2025 - 17:41:36 EST


On Mon, Feb 03, 2025 at 10:18:24AM +0000, Kevin Brodsky wrote:
> This is a proposal to leverage protection keys (pkeys) to harden
> critical kernel data, by making it mostly read-only. The series includes
> a simple framework called "kpkeys" to manipulate pkeys for in-kernel use,
> as well as a page table hardening feature based on that framework
> (kpkeys_hardened_pgtables). Both are implemented on arm64 as a proof of
> concept, but they are designed to be compatible with any architecture
> implementing pkeys.

Does QEMU support POE? The only mention I could find is here:
https://mail.gnu.org/archive/html/qemu-arm/2024-03/msg00486.html
where the answer is, "no and it looks difficult". :P

> # Threat model
>
> The proposed scheme aims at mitigating data-only attacks (e.g.
> use-after-free/cross-cache attacks). In other words, it is assumed that
> control flow is not corrupted, and that the attacker does not achieve
> arbitrary code execution. Nothing prevents the pkey register from being
> set to its most permissive state - the assumption is that the register
> is only modified on legitimate code paths.

Do you have any tests that could be added to drivers/misc/lkdtm that
explicitly exercise the protection? That is where many hardware security
features get tested. (i.e. a successful test will generally trigger a
BUG_ON or similar.)

> The arm64 implementation should be considered a proof of concept only.
> The enablement of POE for in-kernel use is incomplete; in particular
> POR_EL1 (pkey register) should be reset on exception entry and restored
> on exception return.

As in, make sure the interrupted context's POR_EL1 value doesn't leak
into an exception handler?

> # Open questions
>
> A few aspects in this RFC that are debatable and/or worth discussing:
>
> - There is currently no restriction on how kpkeys levels map to pkeys
> permissions. A typical approach is to allocate one pkey per level and
> make it writable at that level only. As the number of levels
> increases, we may however run out of pkeys, especially on arm64 (just
> 8 pkeys with POE). Depending on the use-cases, it may be acceptable to
> use the same pkey for the data associated to multiple levels.
>
> Another potential concern is that a given piece of code may require
> write access to multiple privileged pkeys. This could be addressed by
> introducing a notion of hierarchy in trust levels, where Tn is able to
> write to memory owned by Tm if n >= m, for instance.
>
> - kpkeys_set_level() and kpkeys_restore_pkey_reg() are not symmetric:
> the former takes a kpkeys level and returns a pkey register value, to
> be consumed by the latter. It would be more intuitive to manipulate
> kpkeys levels only. However this assumes that there is a 1:1 mapping
> between kpkeys levels and pkey register values, while in principle
> the mapping is 1:n (certain pkeys may be used outside the kpkeys
> framework).
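
A rough userspace model of the asymmetry described above may help frame
the question. This is only a sketch, not code from the series: every
model_* name, the 4-bit-per-pkey encoding, and the pkey numbering are
hypothetical, and it assumes the value returned when setting a level is
the previous register value. It shows why restoring needs a register
value rather than a level: a pkey outside the framework (MODEL_PKEY_OTHER)
also lives in the register, so a level alone does not determine it.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: names, encoding and pkey numbers are hypothetical. */
#define MODEL_PERM_BITS 4u
#define MODEL_PERM_MASK 0xfULL
#define MODEL_PERM_R    0x1ULL
#define MODEL_PERM_W    0x2ULL
#define MODEL_PERM_RW   (MODEL_PERM_R | MODEL_PERM_W)

enum model_level { MODEL_LVL_DEFAULT, MODEL_LVL_PGTABLES };
enum model_pkey  { MODEL_PKEY_DEFAULT, MODEL_PKEY_PGTABLES, MODEL_PKEY_OTHER };

/* Stand-in for the hardware pkey register. */
uint64_t model_pkey_reg;

static uint64_t model_set_perm(uint64_t reg, unsigned int pkey, uint64_t perm)
{
	unsigned int shift = pkey * MODEL_PERM_BITS;

	reg &= ~(MODEL_PERM_MASK << shift);
	return reg | ((perm & MODEL_PERM_MASK) << shift);
}

static uint64_t model_get_perm(uint64_t reg, unsigned int pkey)
{
	return (reg >> (pkey * MODEL_PERM_BITS)) & MODEL_PERM_MASK;
}

/* Takes a level, returns the previous register value (assumed semantics). */
uint64_t model_set_level(enum model_level level)
{
	uint64_t prev = model_pkey_reg;
	uint64_t pgt_perm = MODEL_PERM_R;

	if (level == MODEL_LVL_PGTABLES)
		pgt_perm |= MODEL_PERM_W;

	/* Only the pkey owned by the framework is rewritten. */
	model_pkey_reg = model_set_perm(prev, MODEL_PKEY_PGTABLES, pgt_perm);
	return prev;
}

/* Takes a register value, not a level. */
void model_restore_pkey_reg(uint64_t prev)
{
	model_pkey_reg = prev;
}

/* Returns 0 if the model behaves as described, nonzero otherwise. */
int model_demo(void)
{
	uint64_t saved;

	/* Start: page tables read-only, a non-framework pkey writable. */
	model_pkey_reg = 0;
	model_pkey_reg = model_set_perm(model_pkey_reg, MODEL_PKEY_DEFAULT,
					MODEL_PERM_RW);
	model_pkey_reg = model_set_perm(model_pkey_reg, MODEL_PKEY_PGTABLES,
					MODEL_PERM_R);
	model_pkey_reg = model_set_perm(model_pkey_reg, MODEL_PKEY_OTHER,
					MODEL_PERM_RW);

	saved = model_set_level(MODEL_LVL_PGTABLES);
	if (!(model_get_perm(model_pkey_reg, MODEL_PKEY_PGTABLES) & MODEL_PERM_W))
		return 1;	/* page tables should now be writable */
	if (model_get_perm(model_pkey_reg, MODEL_PKEY_OTHER) != MODEL_PERM_RW)
		return 2;	/* non-framework pkey must be untouched */

	model_restore_pkey_reg(saved);
	if (model_get_perm(model_pkey_reg, MODEL_PKEY_PGTABLES) != MODEL_PERM_R)
		return 3;	/* page tables read-only again */
	return model_pkey_reg == saved ? 0 : 4;
}
```

In this model the register carries permissions for every pkey at once,
and a "level" is just one way of setting the bits the framework owns.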

Is the "levels" nature of this related to how POE behaves? It sounds
like there can only be 1 pkey active at a time (a role), rather than
each pkey representing access to a specific set of pages (a key in a
keyring), where many pkeys could be active at the same time. Am I
understanding that correctly?

> Any comment or feedback will be highly appreciated, be it on the
> high-level approach or implementation choices!

As hinted earlier with my QEMU question... what's the best way for me
to test this myself? :)

Thanks for working on this! Data-only attacks have been on the rise for
a while now, and I'm excited to see some viable mitigations appearing.
Yay!

-Kees

--
Kees Cook