[PATCH 0/3] mm: split the file's i_mmap tree for NUMA
From: Huang Shijie
Date: Mon Apr 13 2026 - 02:39:42 EST
A NUMA system may have many NUMA nodes and many CPUs.
For example, a Hygon server has 12 NUMA nodes and 384 CPUs.
UnixBench includes a test named "execl", which exercises
the execve system call.
When we test our server with "./Run -c 384 execl",
the result is not good enough. The i_mmap locks are heavily contended on
"libc.so" and "ld.so". For example, the i_mmap tree for "libc.so" can have
over 6000 VMAs, and those VMAs can be in different NUMA nodes.
The insert/remove operations do not run quickly enough.
Patch 1 and patch 2 hide the direct accesses to i_mmap.
Patch 3 splits the i_mmap tree into sibling trees. With this patch set
we get a 77% performance improvement (average of 10 runs).
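To illustrate the idea for readers who have not opened the patches yet, here is a rough userspace model of a per-node split. It is only a sketch and not the code in this series: every model_* name is hypothetical, a linked list stands in for the interval tree, and giving each sibling tree its own lock is an assumption made for the model. The accessor model_get_root() mirrors the intent of patches 1 and 2, which is that callers never touch the tree storage directly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_NODES 12	/* the 12-node server from the example above */

/* Hypothetical stand-in for a VMA; it carries only what the model needs. */
struct model_vma {
	unsigned int node;		/* NUMA node the mapping belongs to */
	struct model_vma *next;		/* a list stands in for the interval tree */
};

/* One sibling tree. The per-tree lock is an assumption of this model. */
struct model_subtree {
	pthread_mutex_t lock;
	struct model_vma *head;
	size_t count;
};

/* Hypothetical stand-in for the file's mapping: one sibling tree per node. */
struct model_mapping {
	struct model_subtree sibling[MODEL_NR_NODES];
};

static void model_mapping_init(struct model_mapping *m)
{
	for (size_t i = 0; i < MODEL_NR_NODES; i++) {
		pthread_mutex_init(&m->sibling[i].lock, NULL);
		m->sibling[i].head = NULL;
		m->sibling[i].count = 0;
	}
}

/* Accessor: callers ask for the tree of a node instead of touching sibling[]. */
static struct model_subtree *model_get_root(struct model_mapping *m,
					    unsigned int node)
{
	return &m->sibling[node % MODEL_NR_NODES];
}

/* Insert takes only the lock of the sibling tree that owns the VMA's node. */
static void model_insert(struct model_mapping *m, struct model_vma *vma)
{
	struct model_subtree *t;

	assert(vma != NULL);
	t = model_get_root(m, vma->node);
	pthread_mutex_lock(&t->lock);
	vma->next = t->head;
	t->head = vma;
	t->count++;
	pthread_mutex_unlock(&t->lock);
}

/* Remove also touches a single sibling tree; returns false if not found. */
static bool model_remove(struct model_mapping *m, struct model_vma *vma)
{
	struct model_subtree *t = model_get_root(m, vma->node);
	bool found = false;

	pthread_mutex_lock(&t->lock);
	for (struct model_vma **pp = &t->head; *pp; pp = &(*pp)->next) {
		if (*pp == vma) {
			*pp = vma->next;
			vma->next = NULL;
			t->count--;
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);
	return found;
}

/* "Is the file mapped anywhere?" now has to look at every sibling tree. */
static bool model_mapped(struct model_mapping *m)
{
	bool mapped = false;

	for (size_t i = 0; i < MODEL_NR_NODES && !mapped; i++) {
		pthread_mutex_lock(&m->sibling[i].lock);
		mapped = m->sibling[i].count != 0;
		pthread_mutex_unlock(&m->sibling[i].lock);
	}
	return mapped;
}
```

In this model, two tasks inserting VMAs for different nodes take different locks, so the common insert/remove path no longer serializes on one lock. The trade-off is that any operation which needs the whole file's view, such as model_mapped(), must visit all sibling trees.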
Huang Shijie (3):
  mm: use mapping_mapped to simplify the code
  mm: use get_i_mmap_root to access the file's i_mmap
  mm: split the file's i_mmap tree for NUMA
arch/arm/mm/fault-armv.c | 3 ++-
arch/arm/mm/flush.c | 3 ++-
arch/nios2/mm/cacheflush.c | 3 ++-
arch/parisc/kernel/cache.c | 4 ++-
fs/dax.c | 3 ++-
fs/hugetlbfs/inode.c | 10 +++----
fs/inode.c | 55 +++++++++++++++++++++++++++++++++++++-
include/linux/fs.h | 40 +++++++++++++++++++++++++++
include/linux/mm.h | 33 +++++++++++++++++++++++
include/linux/mm_types.h | 1 +
kernel/events/uprobes.c | 3 ++-
mm/hugetlb.c | 7 +++--
mm/khugepaged.c | 6 +++--
mm/memory-failure.c | 8 +++---
mm/memory.c | 8 +++---
mm/mmap.c | 3 ++-
mm/nommu.c | 11 +++++---
mm/pagewalk.c | 2 +-
mm/rmap.c | 2 +-
mm/vma.c | 36 +++++++++++++++++++------
mm/vma_init.c | 1 +
21 files changed, 204 insertions(+), 38 deletions(-)
--
2.43.0