Re: [PATCH RFC v3 2/4] mm/pgtable: Make pfn_pte() filter out huge page attributes

From: Yin Tirui

Date: Wed Mar 04 2026 - 05:08:58 EST




On 3/4/2026 3:52 PM, Jürgen Groß wrote:
On 28.02.26 08:09, Yin Tirui wrote:
A fundamental principle of page table type safety is that `pte_t` represents
the lowest level page table entry and should never carry huge page attributes.

Currently, passing a pgprot with huge page bits (e.g., extracted via
pmd_pgprot()) into pfn_pte() creates a malformed PTE that retains the huge
attribute, leading to the necessity of the ugly `pte_clrhuge()` anti-pattern.

Enforce type safety by making `pfn_pte()` inherently filter out huge page
attributes:
- On x86: Strip the `_PAGE_PSE` bit.
- On ARM64: Mask out the block descriptor bits in `PTE_TYPE_MASK` and
   enforce the `PTE_TYPE_PAGE` format.
- On RISC-V: No changes required, as RISC-V leaf PMDs and PTEs share the
   exact same hardware format and do not use a distinct huge bit.

Signed-off-by: Yin Tirui <yintirui@xxxxxxxxxx>
---
  arch/arm64/include/asm/pgtable.h | 4 +++-
  arch/x86/include/asm/pgtable.h   | 4 ++++
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b3e58735c49b..f2a7a40106d2 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -141,7 +141,9 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
  #define pte_pfn(pte)        (__pte_to_phys(pte) >> PAGE_SHIFT)
  #define pfn_pte(pfn,prot)    \
-    __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
+    __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | \
+        ((pgprot_val(prot) & ~(PTE_TYPE_MASK & ~PTE_VALID)) | \
+        (PTE_TYPE_PAGE & ~PTE_VALID)))
  #define pte_none(pte)        (!pte_val(pte))
  #define pte_page(pte)        (pfn_to_page(pte_pfn(pte)))
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1662c5a8f445..a4dbd81d42bf 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -738,6 +738,10 @@ static inline pgprotval_t check_pgprot(pgprot_t pgprot)
  static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
  {
      phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
+
+    /* Filter out _PAGE_PSE to ensure PTEs never carry the huge page bit */
+    pgprot = __pgprot(pgprot_val(pgprot) & ~_PAGE_PSE);

Is it really a good idea to silently drop the bit?

Today it can either be used for a large page (which should be a pmd,
of course), or - much worse - you'd strip the _PAGE_PAT bit, which is
at the same position in PTEs.

So basically you are removing the ability to use some cache modes.

NACK!


Juergen

Hi Jürgen,

You are absolutely right. I missed the fact that `_PAGE_PSE` aliases with `_PAGE_PAT` on 4K PTEs.

The intention here was to follow previous feedback to enforce type safety by filtering out huge page attributes directly inside `pfn_pte()`. However, doing it this way obviously breaks the cache modes on x86.
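
For the record, here is a minimal userspace sketch of the aliasing. The bit positions are the ones from arch/x86/include/asm/pgtable_types.h; `strip_pse()` and `pte_pat_index()` are hypothetical helpers written only for this illustration, not kernel functions, and `strip_pse()` merely mimics the masking this patch added to `pfn_pte()`.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as in arch/x86/include/asm/pgtable_types.h */
#define _PAGE_BIT_PWT   3
#define _PAGE_BIT_PCD   4
#define _PAGE_BIT_PSE   7   /* huge page bit in PMD/PUD entries */
#define _PAGE_BIT_PAT   7   /* PAT bit in 4K PTEs: same position as PSE */

#define _PAGE_PWT   (UINT64_C(1) << _PAGE_BIT_PWT)
#define _PAGE_PCD   (UINT64_C(1) << _PAGE_BIT_PCD)
#define _PAGE_PSE   (UINT64_C(1) << _PAGE_BIT_PSE)
#define _PAGE_PAT   (UINT64_C(1) << _PAGE_BIT_PAT)

/* Hypothetical stand-in for the filtering this patch did in pfn_pte(). */
static inline uint64_t strip_pse(uint64_t prot)
{
	return prot & ~_PAGE_PSE;
}

/*
 * Hypothetical helper: the PAT entry a 4K PTE selects, formed from the
 * PAT, PCD and PWT bits as a 3-bit index (PAT is the most significant).
 */
static inline unsigned int pte_pat_index(uint64_t prot)
{
	unsigned int pat = (prot & _PAGE_PAT) ? 1u : 0u;
	unsigned int pcd = (prot & _PAGE_PCD) ? 1u : 0u;
	unsigned int pwt = (prot & _PAGE_PWT) ? 1u : 0u;

	return (pat << 2) | (pcd << 1) | pwt;
}
```

With these definitions, a 4K pgprot of `_PAGE_PAT | _PAGE_PWT` selects PAT entry 5, but after `strip_pse()` it selects entry 1. In other words, masking bit 7 makes PAT entries 4 through 7 unreachable from any PTE built this way, which is exactly the loss of cache modes you pointed out.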

I agree with the NACK. I will drop this approach and rethink how to handle the huge-to-normal pgprot conversion safely for v4.

--
Thanks,
Yin Tirui