Re: [RFC PATCH] x86/mm: Disable PTI for kernel_ident_mapping_init()

From: David Woodhouse
Date: Tue Nov 26 2024 - 06:42:45 EST


On Mon, 2024-11-25 at 11:13 -0800, Dave Hansen wrote:
> On 11/25/24 10:53, David Woodhouse wrote:
> > > I think we have a lot of software-available space in the page table
> > > pointer entries. What would folks think if we set a special bit in those
> > > p4d entries that said:
> > >
> > > "I don't need to be propagated to
> > > the user portion of the page tables."
> > >
> > > It would obviously get set in this code that you're trying to fix. It
> > > might _also_ be able to be set in in "_USR", like here:
> > >
> > > #define _KERNPG_TABLE_NOENC  (__PP|__RW|   0|___A|   0|___D|   0|   0)
> > > #define _PAGE_TABLE_NOENC    (__PP|__RW|_USR|___A|   0|___D|   0|   0)
> > >
> > > like:
> > >
> > > #define _USR _PAGE_USER|_PAGE_SW_WHATEVER
> > In fact, do we even need a separate bit? Any PTE without the _PAGE_USER
> > bit set clearly doesn't need to be mirrored into the user page
> > tables...?
>
> I can't think of any exceptions where this would break off the top of my
> head. It seems too simple to work. ;)

On IRC we discussed the fact that it's slightly non-trivial to use that
as the trigger for pgd_clear().

I threw this version together and it didn't immediately explode...

Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 6f82e75b6149..4b804531b03c 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -36,10 +36,12 @@
#define _PAGE_BIT_DEVMAP _PAGE_BIT_SOFTW4

#ifdef CONFIG_X86_64
-#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW5 /* Saved Dirty bit */
+#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW5 /* Saved Dirty bit (leaf) */
+#define _PAGE_BIT_NOPTISHADOW _PAGE_BIT_SOFTW5 /* No PTI shadow (root PGD) */
#else
/* Shared with _PAGE_BIT_UFFD_WP which is not supported on 32 bit */
-#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW2 /* Saved Dirty bit */
+#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW2 /* Saved Dirty bit (leaf) */
+#define _PAGE_BIT_NOPTISHADOW _PAGE_BIT_SOFTW2 /* No PTI shadow (root PGD) */
#endif

/* If _PAGE_BIT_PRESENT is clear, we use these: */
@@ -139,6 +141,8 @@

#define _PAGE_PROTNONE (_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE)

+#define _PAGE_NOPTISHADOW (_AT(pteval_t, 1) << _PAGE_BIT_NOPTISHADOW)
+
/*
* Set of bits not changed in pte_modify. The pte's
* protection key is treated like _PAGE_RW, for
diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c
index 437e96fb4977..5ab7bd2f1983 100644
--- a/arch/x86/mm/ident_map.c
+++ b/arch/x86/mm/ident_map.c
@@ -174,7 +174,7 @@ static int ident_p4d_init(struct x86_mapping_info *info, p4d_t *p4d_page,
if (result)
return result;

- set_p4d(p4d, __p4d(__pa(pud) | info->kernpg_flag));
+ set_p4d(p4d, __p4d(__pa(pud) | info->kernpg_flag | _PAGE_NOPTISHADOW));
}

return 0;
@@ -218,14 +218,14 @@ int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page,
if (result)
return result;
if (pgtable_l5_enabled()) {
- set_pgd(pgd, __pgd(__pa(p4d) | info->kernpg_flag));
+ set_pgd(pgd, __pgd(__pa(p4d) | info->kernpg_flag | _PAGE_NOPTISHADOW));
} else {
/*
* With p4d folded, pgd is equal to p4d.
* The pgd entry has to point to the pud page table in this case.
*/
pud_t *pud = pud_offset(p4d, 0);
- set_pgd(pgd, __pgd(__pa(pud) | info->kernpg_flag));
+ set_pgd(pgd, __pgd(__pa(pud) | info->kernpg_flag | _PAGE_NOPTISHADOW));
}
}

diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 851ec8f1363a..5f0d579932c6 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -132,7 +132,7 @@ pgd_t __pti_set_user_pgtbl(pgd_t *pgdp, pgd_t pgd)
* Top-level entries added to init_mm's usermode pgd after boot
* will not be automatically propagated to other mms.
*/
- if (!pgdp_maps_userspace(pgdp))
+ if (!pgdp_maps_userspace(pgdp) || (pgd.pgd & _PAGE_NOPTISHADOW))
return pgd;

/*

Attachment: smime.p7s
Description: S/MIME cryptographic signature