[PATCH v2 0/4] mm/hugetlb: fixes for PMD table sharing (incl. using mmu_gather)

From: David Hildenbrand (Red Hat)

Date: Fri Dec 12 2025 - 02:10:24 EST


One functional fix, one performance regression fix, and two related
comment fixes.

I cleaned up the prototype I recently shared [1] for the performance fix,
deferring most of the cleanups I had in the prototype to a later point.
While doing that, I identified the other issues.

The goal is for this patch set to be backported to stable trees "fairly"
easily, at least patches #1 and #4.

Patch #1 fixes hugetlb_pmd_shared() not detecting any sharing.
Patches #2 + #3 are simple comment fixes that patch #4 interacts with.
Patch #4 is a fix for the reported performance regression due to excessive
IPI broadcasts during fork()+exit().

The last patch is all about TLB flushes, IPIs and mmu_gather.
Read: complicated

I added as many comments and as much description as I possibly could, and
I am hoping for review from Jann.

There are plenty of cleanups in the future to be had + one reasonable
optimization on x86. But that's all out of scope for this series.

Compile tested on plenty of architectures.

Runtime tested, with a focus on fixing the performance regression using
the original reproducer [2] on x86.

I'm still busy with more testing (making sure that my TLB flushing changes
are good), but I am sending this out already so people can test and review,
as I am heading to LPC soon.

[1] https://lore.kernel.org/all/8cab934d-4a56-44aa-b641-bfd7e23bd673@xxxxxxxxxx/
[2] https://lore.kernel.org/all/8cab934d-4a56-44aa-b641-bfd7e23bd673@xxxxxxxxxx/

--

v1 -> v2:
* Picked up RBs/ACKs, hopefully I didn't miss any
* Added the initialization of fully_unshared_tables in __tlb_gather_mmu()
(Thanks Nadav!)
* Refined some comments based on Lorenzo's feedback.

Sending it out already as I have some spare minutes and we should start
queuing the fixed version. Maybe there will be some more comment changes
later based on the discussion with Lorenzo.

Cc: Will Deacon <will@xxxxxxxxxx>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Nick Piggin <npiggin@xxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Arnd Bergmann <arnd@xxxxxxxx>
Cc: Muchun Song <muchun.song@xxxxxxxxx>
Cc: Oscar Salvador <osalvador@xxxxxxx>
Cc: "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>
Cc: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Jann Horn <jannh@xxxxxxxxxx>
Cc: Pedro Falcato <pfalcato@xxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxxx>
Cc: Harry Yoo <harry.yoo@xxxxxxxxxx>
Cc: "Uschakow, Stanislav" <suschako@xxxxxxxxx>
Cc: Laurence Oberman <loberman@xxxxxxxxxx>
Cc: Prakash Sangappa <prakash.sangappa@xxxxxxxxxx>
Cc: Nadav Amit <nadav.amit@xxxxxxxxx>

David Hildenbrand (Red Hat) (4):
  mm/hugetlb: fix hugetlb_pmd_shared()
  mm/hugetlb: fix two comments related to huge_pmd_unshare()
  mm/rmap: fix two comments related to huge_pmd_unshare()
  mm/hugetlb: fix excessive IPI broadcasts when unsharing PMD tables
    using mmu_gather

 include/asm-generic/tlb.h |  74 +++++++++++++++++++++-
 include/linux/hugetlb.h   |  21 ++++---
 mm/hugetlb.c              | 129 ++++++++++++++++++++------------------
 mm/mmu_gather.c           |   7 +++
 mm/mprotect.c             |   2 +-
 mm/rmap.c                 |  45 +++++++------
 6 files changed, 184 insertions(+), 94 deletions(-)

--
2.52.0