Re: [RFC PATCH v4 05/20] x86/mm: Add _PAGE_NOPTISHADOW bit to avoid updating userspace page tables

From: David Woodhouse
Date: Tue Dec 03 2024 - 13:24:43 EST


On Wed, 2024-11-27 at 19:00 +0000, David Woodhouse wrote:
> From: David Woodhouse <dwmw@xxxxxxxxxxxx>
>
> The set_p4d() and set_pgd() functions (in 4-level or 5-level page table setups
> respectively) assume that the root page table is actually an 8KiB allocation,
> with the userspace root immediately after the kernel root page table (so that
> the former can enforce NX on all the subordinate pages, which are actually
> shared).
>
> However, users of the kernel_ident_mapping_init() code do not give it an 8KiB
> allocation for its PGD. Both swsusp_arch_resume() and acpi_mp_setup_reset()
> allocate only a single 4KiB page. The kexec code on x86_64 currently gets
> away with it purely by chance, because it allocates 8KiB for its "control
> code page" and then actually uses the first half for the PGD, then copies the
> actual trampoline code into the second half only after the identmap code has
> finished scribbling over it.
>
> Fix this by defining a _PAGE_NOPTISHADOW bit (which can use the same bit as
> _PAGE_SAVED_DIRTY since one is only for the PGD/P4D root and the other is
> exclusively for leaf PTEs). This instructs __pti_set_user_pgtbl() not to
> write to the userspace 'shadow' PGD.
>
> Strictly, the _PAGE_NOPTISHADOW bit doesn't need to be written out to the
> actual page tables; since __pti_set_user_pgtbl() returns the value to be
> written to the kernel page table, it could be filtered out. But there seems
> to be no benefit to actually doing so.
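
For anyone skimming, here is a rough userspace model of the behaviour described
above. It is only a sketch, not the kernel code: every model_* name is made up
for illustration, the bit position chosen for the flag is an arbitrary
placeholder, and the real __pti_set_user_pgtbl() does more (NX handling, checks
on which entries map userspace) that is left out here. It assumes the shadow
root sits exactly one 4KiB page (512 entries) above the kernel root.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PTRS_PER_PGD 512u   /* 512 eight-byte entries = one 4KiB page */

/* Hypothetical bit position, chosen only for this model. */
#define MODEL_PAGE_NOPTISHADOW (1ULL << 58)

typedef uint64_t model_pgd_t;

/* The shadow slot is assumed to sit exactly one page above the kernel slot. */
static model_pgd_t *model_kernel_to_user_pgdp(model_pgd_t *pgdp)
{
	return pgdp + MODEL_PTRS_PER_PGD;
}

/*
 * Mirror the entry into the shadow root unless the caller has flagged
 * that no shadow exists. The returned value is what goes into the
 * kernel root; the flag is left in place rather than filtered out.
 */
static model_pgd_t model_pti_set_user_pgtbl(model_pgd_t *pgdp, model_pgd_t pgd)
{
	if (pgd & MODEL_PAGE_NOPTISHADOW)
		return pgd;

	*model_kernel_to_user_pgdp(pgdp) = pgd;
	return pgd;
}

static void model_set_pgd(model_pgd_t *pgdp, model_pgd_t pgd)
{
	assert(pgdp != 0);
	*pgdp = model_pti_set_user_pgtbl(pgdp, pgd);
}
```

With a root that is really only one page, the unflagged write lands in whatever
happens to follow it, which is the scribbling the kexec control page currently
survives by accident. A flagged write leaves the following page alone.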

Ping? I think the rest of the kexec-debug series is in fairly good
shape; this is the only part I'm slightly unsure about.
