Re: [PATCH v5 0/2] sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems

From: Aithal, Srikanth
Date: Thu Apr 24 2025 - 03:52:40 EST


On 4/24/2025 8:15 AM, Libo Chen wrote:
v1->v2:
1. add perf improvement numbers in commit log. Yet to find perf diff on
will-it-scale, so not included here. Plan to run more workloads.
2. add tracepoint.
3. To peterz's comment, this will make it impossible to attract tasks to
that memory, just like other VMA skippings. This is the current
implementation; I think we can improve that in the future, but at the
moment it's probably better to keep it consistent.

v2->v3:
1. add enable_cpuset() based on Mel's suggestion but again I think it's
redundant.
2. print out nodemask with %*p.. format in the tracepoint.

v3->v4:
1. fix an unsafe dereference of a pointer to content not on ring buffer,
namely mem_allowed_ptr in the tracepoint.

v4->v5:
1. add BUILD_BUG_ON() in TP_fast_assign() to guard against future
changes (particularly in size) in nodemask_t.

Libo Chen (2):
  sched/numa: Skip VMA scanning on memory pinned to one NUMA node via
    cpuset.mems
  sched/numa: Add tracepoint that tracks the skipping of numa balancing
    due to cpuset memory pinning

 include/trace/events/sched.h | 33 +++++++++++++++++++++++++++++++++
 kernel/sched/fair.c          |  9 +++++++++
 2 files changed, 42 insertions(+)


Tested on top of next-20250424. The boot warning[1] is fixed with this version.

Tested-by: Srikanth Aithal <sraithal@xxxxxxx>


[1]: https://lore.kernel.org/all/20250422205740.02c4893a@xxxxxxxxxxxxxxxx/