Re: [PATCH v6 00/30] pkeys-based page table hardening

From: Kevin Brodsky

Date: Wed Apr 15 2026 - 11:50:48 EST


On 15/04/2026 14:48, David Hildenbrand (Arm) wrote:
> On 2/27/26 18:54, Kevin Brodsky wrote:
>> NEW in v6: support for large block mappings through a dedicated page table
>> allocator (patches 14-17)
> Heh, I had to read till the very end to realize that this is an RFC, and
> then saw your other mail.
>
> I can recommend using b4 for patch management, where you can configure a
> sticky prefix through
>
> b4 prep --set-prefixes RFC
>
> And using "b4 send" to automate all the rest.

I certainly should... sorry for the confusion!

>> Threat model
>> ============
>>
>> The proposed scheme aims at mitigating data-only attacks (e.g.
>> use-after-free/cross-cache attacks). In other words, it is assumed that
>> control flow is not corrupted, and that the attacker does not achieve
>> arbitrary code execution. Nothing prevents the pkey register from being
>> set to its most permissive state - the assumption is that the register
>> is only modified on legitimate code paths.
>>
>> A few related notes:
>>
>> - Functions that set the pkey register are all implemented inline.
>> Besides performance considerations, this is meant to avoid creating
>> a function that can be used as a straightforward gadget to set the
>> pkey register to an arbitrary value.
>>
>> - kpkeys_set_level() only accepts a compile-time constant as argument,
>> as a variable could be manipulated by an attacker. This could be
>> relaxed but it seems unlikely that a variable kpkeys level would be
>> needed in practice.
>>
> I see a lot of value for that also as a debugging mechanism. I hear that
> other people had private patches that would attempt to only map leaf
> pages in the direct map in pte_offset_map_lock() and friends. I assume
> there are some tricky bits to that (concurrent access to page tables).

Indeed, this should be a much better solution, not only because it means
a lot fewer TLBIs, but also because it is truly per-thread (so
concurrency is not a concern).

> What's the general take regarding the threat model you describe vs. MTE?

I'd say quite similar, although corrupting pointers (specifically the
tag bits) remains possible in a data-only attack, while corrupting the
POR_EL1 register would require some control flow hijack (only constant
values are written to POR_EL1).
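
To illustrate the constant-only point, here is a minimal userspace sketch.
It is not the code from the series: the register is modelled as a plain
variable, and the names (toy_set_level(), TOY_LVL_*) and the
one-bit-per-pkey encoding are made up for the example. The _Static_assert
needs an integer constant expression, so passing a variable fails to build.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical levels, for illustration only. */
enum { TOY_LVL_DEFAULT, TOY_LVL_PGTABLES, TOY_LVL_COUNT };

/* Stand-in for the pkey register (POR_EL1 on arm64). */
static uint64_t toy_pkey_reg;

/* Toy encoding (an assumption): bit N set means pkey N is writable. */
static inline uint64_t toy_level_to_reg(int level)
{
	switch (level) {
	case TOY_LVL_PGTABLES:
		return 0x3;	/* pkeys 0 and 1 writable */
	default:
		return 0x1;	/* only pkey 0 writable */
	}
}

/*
 * _Static_assert needs an integer constant expression, so a call such as
 * toy_set_level(some_variable) is rejected at build time.
 */
#define toy_set_level(level) do {					\
	_Static_assert((level) >= 0 && (level) < TOY_LVL_COUNT,		\
		       "level must be a compile-time constant");	\
	toy_pkey_reg = toy_level_to_reg(level);				\
} while (0)
```

With this shape, toy_set_level(TOY_LVL_PGTABLES) builds, while a level read
from memory at runtime does not, so corrupting data alone cannot change the
value that ends up in the register.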

> Regarding use-after-free, I'd assume KASAN would achieve something
> similar. And with MTE "reasonably" fast. Or what is the biggest
> difference you see, there?

For use-after-free specifically, yes that sounds about right.

> I'd assume that one difference would be, that not even match-all
> pointers could accidentally modify page tables.

Yep that's pretty much what I tried to say above - with pkeys you have
to corrupt a system register to bypass the protection.

> In the future, would you think that both mechanisms (pkey PT table
> protection + KASAN) would be active at the same time, or wouldn't there
> really be a lot of value in having both enabled?

I think these are fairly orthogonal: KASAN gives you probabilistic
spatial+temporal safety for most allocations, while kpkeys restricts
access to key data to a small set of functions. I don't think one
reduces the usefulness of the other. Of course KASAN makes it harder to
use an arbitrary pointer to write to page tables, but kpkeys gives a
clear guarantee (assuming CFI is preserved).
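
As a rough illustration of "a small set of functions", here is a userspace
toy model. It is only a sketch under stated assumptions: all names
(toy_write(), toy_set_pte(), struct toy_page) are hypothetical, the
register is a single global rather than per-thread state, and the
permission check that hardware would perform is done in software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pkey numbers, for illustration only. */
enum { TOY_PKEY_DEFAULT = 0, TOY_PKEY_PGTABLES = 1 };

/* Toy register: bit N set means pkey N is currently writable. */
static uint64_t toy_reg = 1u << TOY_PKEY_DEFAULT;

struct toy_page {
	int pkey;	/* pkey the page is mapped with */
	uint64_t data;
};

/* Models the hardware check: the write lands only if the pkey allows it. */
static bool toy_write(struct toy_page *page, uint64_t val)
{
	if (!(toy_reg & (1u << page->pkey)))
		return false;	/* would fault on real hardware */
	page->data = val;
	return true;
}

/* The only writer that raises the level; it restores it before returning. */
static bool toy_set_pte(struct toy_page *pgtable, uint64_t val)
{
	uint64_t saved = toy_reg;
	bool ok;

	toy_reg = saved | (1u << TOY_PKEY_PGTABLES);
	ok = toy_write(pgtable, val);
	toy_reg = saved;
	return ok;
}
```

A stray toy_write() to a page tagged TOY_PKEY_PGTABLES is refused whatever
pointer was used to reach it, while the same write issued through
toy_set_pte() succeeds.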

> [...]
>
>>
>> Open questions
>> ==============
>>
>> A few aspects in this RFC that are debatable and/or worth discussing:
>>
>> - Can the pkeys block allocator be abstracted into something more
>> generic? This seems desirable considering other use-cases for changing
>> attributes of regions of the linear map, but the handling of page
>> tables while splitting may be difficult to integrate in a generic
>> allocator.
>>
>> - There is currently no restriction on how kpkeys levels map to pkeys
>> permissions. A typical approach is to allocate one pkey per level and
>> make it writable at that level only. As the number of levels
>> increases, we may however run out of pkeys, especially on arm64 (just
>> 8 pkeys with POE). Depending on the use-cases, it may be acceptable to
>> use the same pkey for the data associated to multiple levels.
>>
>>
>> Any comment or feedback is highly appreciated, be it on the high-level
>> approach or implementation choices!
> How crucial would the dedicated page table allocator be for a first
> upstreamed version?
>
> Assuming we introduce this as a debugging feature first, it would be
> perfectly reasonable to just disallow large block mappings in the direct
> map when enabled.
>
> That means, we could merge basic support first and think about how to
> deal with page tables in a different way with most of the pkey details
> out of the picture.

I think that makes perfect sense, at least on arm64 where it's just a
matter of configuring force_pte_mapping() appropriately. I'm not sure
whether there is such an option on x86, though.

- Kevin