[RFC PATCH 0/3] sched/numa: Introduce per cgroup numa balance control

From: Chen Yu
Date: Tue Feb 25 2025 - 09:06:54 EST


Introduce a per-cgroup interface for enabling NUMA balancing on
specific cgroups. To use it, the system administrator must first
set the NUMA balancing mode to NUMA_BALANCING_CGROUP=4. In this
mode, NUMA balancing is disabled for all cgroups by default and
takes effect only for the cgroups on which the administrator has
explicitly enabled it.
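
As a rough illustration of the semantics described above (a sketch,
not the code in the patches): the helper name numa_balancing_allowed()
and its cgroup_enabled parameter are hypothetical, and treating
NUMA_BALANCING_CGROUP as a bit tested alongside the existing mode bits
is an assumption. Only the value 4 and the default-off behaviour come
from the description above.

```c
#include <stdbool.h>

/* Existing mode values from include/linux/sched/sysctl.h. */
#define NUMA_BALANCING_DISABLED		0x0
#define NUMA_BALANCING_NORMAL		0x1
#define NUMA_BALANCING_MEMORY_TIERING	0x2
/* New mode proposed by this series. */
#define NUMA_BALANCING_CGROUP		0x4

/*
 * Hypothetical helper: decide whether NUMA balancing applies to a task.
 * @mode:           current system-wide NUMA balancing mode
 * @cgroup_enabled: whether the administrator enabled balancing for the
 *                  task's cgroup (the name is illustrative only)
 */
static bool numa_balancing_allowed(unsigned int mode, bool cgroup_enabled)
{
	/* In cgroup mode everything is off unless the cgroup opted in. */
	if (mode & NUMA_BALANCING_CGROUP)
		return cgroup_enabled;

	/* Assumption: other modes keep today's system-wide behaviour. */
	return (mode & NUMA_BALANCING_NORMAL) != 0;
}
```

With mode set to NUMA_BALANCING_CGROUP, the helper returns true only
for tasks whose cgroup has been opted in. In the other modes the
per-cgroup setting is ignored in this sketch.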

Per-cgroup NUMA balancing control was proposed in 2019 by
Yun Wang[1]. Then, in 2024, Kaiyang Zhao mentioned during a
discussion with David Rientjes that he was working with Meta
on per-cgroup NUMA control[2].

I could not find any further discussion of per-cgroup NUMA
balancing after that point. This RFC series is a rough version
that has only been compile-tested; it has not been thoroughly
tested and may have unhandled cases (for example, THP). It is
intended to start, or resume, the discussion on per-cgroup NUMA
balancing.

The first patch enhances the NUMA balancing statistics. The
second patch introduces per-cgroup NUMA balancing. The third
patch enhances NUMA balancing for the MPOL_INTERLEAVE policy.

Any feedback would be appreciated.

[1] https://lore.kernel.org/linux-fsdevel/60b59306-5e36-e587-9145-e90657daec41@xxxxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/linux-mm/ZrukILyQhMAKWwTe@localhost.localhost/T/

Chen Yu (3):
  sched/numa: Introduce numa balance task migration and swap in
    schedstats
  sched/numa: Introduce per cgroup numa balance control
  sched/numa: Allow intervale memory allocation for numa balance

 include/linux/numa.h           |  1 +
 include/linux/sched.h          |  4 ++++
 include/linux/sched/sysctl.h   |  1 +
 include/linux/vm_event_item.h  |  2 ++
 include/uapi/linux/mempolicy.h |  1 +
 kernel/sched/core.c            | 42 ++++++++++++++++++++++++++++++++--
 kernel/sched/debug.c           |  4 ++++
 kernel/sched/fair.c            | 18 +++++++++++++++
 kernel/sched/sched.h           |  3 +++
 mm/memcontrol.c                |  2 ++
 mm/memory.c                    |  2 +-
 mm/mempolicy.c                 |  7 ++++++
 mm/mprotect.c                  |  5 ++--
 mm/vmstat.c                    |  2 ++
 14 files changed, 89 insertions(+), 5 deletions(-)

--
2.25.1