[PATCH v2 0/5] move per-vma lock into vm_area_struct

From: Suren Baghdasaryan
Date: Tue Nov 12 2024 - 14:46:45 EST


Back when per-vma locks were introduced, vm_lock was moved out of
vm_area_struct in [1] because of the performance regression caused by
false cacheline sharing. Recent investigation [2] revealed that the
regression is limited to a rather old Broadwell microarchitecture and
even there it can be mitigated by disabling adjacent cacheline
prefetching, see [3].
This patchset moves vm_lock back into vm_area_struct, aligning it at the
cacheline boundary and making the vm_area_struct slab cache
cacheline-aligned as well. This causes VMA memory consumption to grow
from 160 (vm_area_struct) + 40 (vm_lock) bytes to 256 bytes:

slabinfo before:
<name> ... <objsize> <objperslab> <pagesperslab> : ...
vma_lock ... 40 102 1 : ...
vm_area_struct ... 160 51 2 : ...

slabinfo after moving vm_lock:
<name> ... <objsize> <objperslab> <pagesperslab> : ...
vm_area_struct ... 256 32 2 : ...

Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages,
which is 5.5MB per 100000 VMAs. This overhead will be addressed in a
separate patchset by replacing rw_semaphore in vma_lock's implementation
with a different type of lock.
Moving vm_lock into vm_area_struct lets us simplify the vm_area_free()
path, which in turn allows us to use SLAB_TYPESAFE_BY_RCU for the
vm_area_struct cache. This should facilitate vm_area_struct reuse and
minimize the number of call_rcu() calls.

Suren Baghdasaryan (5):
mm: introduce vma_start_read_locked{_nested} helpers
mm: move per-vma lock into vm_area_struct
mm: mark vma as detached until it's added into vma tree
mm: make vma cache SLAB_TYPESAFE_BY_RCU
docs/mm: document latest changes to vm_lock

Documentation/mm/process_addrs.rst | 10 +++--
include/linux/mm.h | 54 +++++++++++++++++-----
include/linux/mm_types.h | 16 ++++---
include/linux/slab.h | 6 ---
kernel/fork.c | 72 +++++++-----------------------
mm/memory.c | 2 +-
mm/mmap.c | 2 +
mm/nommu.c | 2 +
mm/userfaultfd.c | 14 +++---
mm/vma.c | 3 ++
tools/testing/vma/vma_internal.h | 3 +-
11 files changed, 92 insertions(+), 92 deletions(-)


base-commit: 931086f2a88086319afb57cd3925607e8cda0a9f
--
2.47.0.277.g8800431eea-goog