[PATCH 4/4] mm: align file-backed mmap to exec folio order in thp_get_unmapped_area
From: Usama Arif
Date: Tue Mar 10 2026 - 12:12:20 EST
thp_get_unmapped_area() is the get_unmapped_area callback for
filesystems such as ext4, xfs, and btrfs. It attempts to align the
virtual address of THP-eligible mappings to PMD_SIZE, but on arm64 with
64K base pages PMD_SIZE is 512M, which is far larger than a typical
shared library mapping, so the alignment attempt always fails and the
address falls back to PAGE_SIZE alignment.
This means shared libraries loaded by ld.so via mmap() get 64K-aligned
virtual addresses, preventing contpte mapping even when 2M folios are
allocated with properly aligned file offsets and physical addresses.
Add a fallback in thp_get_unmapped_area_vmflags() that tries
PAGE_SIZE << exec_folio_order() alignment (2M on arm64 64K pages)
when PMD_SIZE alignment fails. This is small enough that shared
libraries could qualify, enabling contpte mapping for their executable
segments.
This applies to all file-backed mappings (not just exec). Non-exec
file-backed mappings also benefit from contpte mapping when large
folios are used. Aligning all file-backed mappings ensures that any
large folio in the page cache can be contpte-mapped regardless of
the mapping's protection flags, reducing dTLB misses for read-heavy
workloads.
The fallback is gated by exec_folio_order() which returns 0 by default,
making this a no-op on architectures that don't define it.
Signed-off-by: Usama Arif <usama.arif@xxxxxxxxx>
---
mm/huge_memory.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e2746ea74adf..1c9476a5ed51c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1242,6 +1242,23 @@ unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long add
if (ret)
return ret;
 
+ /*
+ * If the arch requested large folios for exec memory, try to align
+ * to the folio size as a fallback. This is much smaller than PMD_SIZE
+ * (e.g. 2M vs 512M on arm64 64K pages), so it succeeds for mappings
+ * that are too small for PMD alignment. Proper alignment ensures that
+ * the hardware can coalesce PTEs (e.g. arm64 contpte) when large
+ * folios are mapped.
+ */
+ if (exec_folio_order()) {
+ unsigned long folio_size = PAGE_SIZE << exec_folio_order();
+
+ ret = __thp_get_unmapped_area(filp, addr, len, off, flags,
+ folio_size, vm_flags);
+ if (ret)
+ return ret;
+ }
+
return mm_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags,
vm_flags);
}
--
2.47.3