[PATCH v2 0/4] mm: split the file's i_mmap tree for NUMA

From: Huang Shijie

Date: Thu Jun 11 2026 - 02:48:51 EST


A NUMA system can have many NUMA nodes and many CPUs.
For example, one Hygon server has 12 NUMA nodes and 384 CPUs.
UnixBench includes an "execl" test, which exercises
the execve system call.

When we run "./Run -c 384 execl" on this server,
the result is not good enough. The i_mmap locks for "libc.so" and "ld.so"
are heavily contended. For example, the i_mmap tree for "libc.so" can hold
over 6000 VMAs, and those VMAs can be spread across different NUMA nodes.
Insert/remove operations on the tree do not run quickly enough.

Patch 1 and patch 2 hide direct accesses to i_mmap behind helpers.
Patch 3 splits the i_mmap tree into sibling trees, each with its own lock.
Patch 4 updates the documentation for the split tree.
With this patch set we get over 400% performance improvement
on our NUMA server.
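
To illustrate the idea only, here is a minimal userspace sketch, not the
code from this series. It assumes one sibling container per node (a plain
linked list stands in for the interval tree), one pthread mutex per
sibling, and an accessor that hides the array from callers. All names
(toy_mapping, toy_subtree, toy_get_subtree, NR_NODES, ...) are hypothetical
and do not appear in the patches.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_NODES 4	/* hypothetical node count, for illustration only */

/* Hypothetical stand-in for a VMA that maps part of a file. */
struct toy_vma {
	unsigned long start;
	int node;		/* sibling container that holds this VMA */
	struct toy_vma *next;
};

/* One sibling container with its own lock (a list, not a real tree). */
struct toy_subtree {
	pthread_mutex_t lock;
	struct toy_vma *head;
	size_t count;
};

struct toy_mapping {
	struct toy_subtree sub[NR_NODES];
};

static void toy_mapping_init(struct toy_mapping *m)
{
	for (int i = 0; i < NR_NODES; i++) {
		pthread_mutex_init(&m->sub[i].lock, NULL);
		m->sub[i].head = NULL;
		m->sub[i].count = 0;
	}
}

/* Accessor: callers go through this instead of touching m->sub[]. */
static struct toy_subtree *toy_get_subtree(struct toy_mapping *m, int node)
{
	assert(node >= 0 && node < NR_NODES);
	return &m->sub[node];
}

/* Insert takes only the lock of the sibling that will hold the VMA. */
static void toy_insert(struct toy_mapping *m, struct toy_vma *vma)
{
	struct toy_subtree *t = toy_get_subtree(m, vma->node);

	pthread_mutex_lock(&t->lock);
	vma->next = t->head;
	t->head = vma;
	t->count++;
	pthread_mutex_unlock(&t->lock);
}

/* Remove also touches a single sibling; returns false if not found. */
static bool toy_remove(struct toy_mapping *m, struct toy_vma *vma)
{
	struct toy_subtree *t = toy_get_subtree(m, vma->node);
	bool found = false;

	pthread_mutex_lock(&t->lock);
	for (struct toy_vma **pp = &t->head; *pp; pp = &(*pp)->next) {
		if (*pp == vma) {
			*pp = vma->next;
			t->count--;
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);
	return found;
}

/* A whole-file query has to visit every sibling container. */
static bool toy_mapping_mapped(struct toy_mapping *m)
{
	bool mapped = false;

	for (int i = 0; i < NR_NODES && !mapped; i++) {
		struct toy_subtree *t = toy_get_subtree(m, i);

		pthread_mutex_lock(&t->lock);
		mapped = t->count != 0;
		pthread_mutex_unlock(&t->lock);
	}
	return mapped;
}
```

In this sketch, inserts and removes for VMAs on different nodes never take
the same lock, while a whole-file query such as toy_mapping_mapped() pays
for a walk over all siblings. The real series may make different trade-offs.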

I did not test the non-NUMA case, since I do not have such a server.

v1 --> v2:
Split not only the i_mmap tree, but also the lock.
v1 : https://lkml.org/lkml/2026/4/13/199

Huang Shijie (4):
mm: use mapping_mapped to simplify the code
mm: use get_i_mmap_root to access the file's i_mmap
mm/fs: split the file's i_mmap tree
docs/mm: update document for split i_mmap tree

Documentation/mm/process_addrs.rst | 63 +++++++---
arch/arm/mm/fault-armv.c | 3 +-
arch/arm/mm/flush.c | 3 +-
arch/nios2/mm/cacheflush.c | 3 +-
arch/parisc/kernel/cache.c | 4 +-
fs/Kconfig | 8 ++
fs/dax.c | 3 +-
fs/hugetlbfs/inode.c | 30 +++--
fs/inode.c | 75 +++++++++++-
include/linux/fs.h | 179 ++++++++++++++++++++++++++++-
include/linux/mm.h | 81 +++++++++++++
include/linux/mm_types.h | 3 +
kernel/events/uprobes.c | 3 +-
mm/hugetlb.c | 7 +-
mm/internal.h | 3 +-
mm/khugepaged.c | 6 +-
mm/memory-failure.c | 8 +-
mm/memory.c | 8 +-
mm/mmap.c | 11 +-
mm/nommu.c | 28 +++--
mm/pagewalk.c | 4 +-
mm/rmap.c | 2 +-
mm/vma.c | 74 +++++++++---
mm/vma_init.c | 3 +
24 files changed, 534 insertions(+), 78 deletions(-)

--
2.53.0