Re: [PATCH] sched/numa: Add statistics of numa balance task migration and swap

From: Chen, Yu C
Date: Wed Apr 02 2025 - 22:54:37 EST


Hi Michal,

thanks for taking a look at this,

On 4/2/2025 9:24 PM, Michal Koutný wrote:
Hello Chen.

On Wed, Apr 02, 2025 at 09:06:11AM +0800, Chen Yu <yu.c.chen@xxxxxxxxx> wrote:
On system with NUMA balancing enabled, it is found that tracking
the task activities due to NUMA balancing is helpful.
...
The following two new fields:

numa_task_migrated
numa_task_swapped

will be displayed in both
/sys/fs/cgroup/{GROUP}/memory.stat and /proc/{PID}/sched

Why is the field /proc/$pid/sched not enough?


In the context of NUMA balancing, it would be helpful to monitor not only the activity of individual tasks/threads but also the resource usage and task migrations at the group level. That helps us quickly evaluate the performance and resource usage of a container, much like the per-memcg numa_pages_migrated and numa_pte_updates counters introduced in
commit f77f0c751478 ("mm,memcg: provide per-cgroup counters for NUMA balancing operations"). Yes, we can iterate over /proc/$pid/sched to
accumulate the NUMA stats, but per-cgroup NUMA stats let users track the overall data of the workload more conveniently.
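
To illustrate the difference, here is a rough userspace sketch of the two approaches: summing the per-task counters versus reading the group-level file once. It assumes the new fields show up as "name : value" lines in /proc/<tid>/sched and as "name value" lines in memory.stat, following the existing formats of those files; the helper names are made up for this example and are not part of the patch.

```python
import os

# The two counters proposed in the patch.
FIELDS = ("numa_task_migrated", "numa_task_swapped")


def parse_sched(text):
    """Pick FIELDS out of /proc/<tid>/sched-style text.

    Assumes each counter is printed as "name : value", like the
    other entries in that file.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name in FIELDS:
            try:
                stats[name] = int(value.strip())
            except ValueError:
                pass  # unexpected format, skip the line
    return stats


def parse_memory_stat(text):
    """Pick FIELDS out of memory.stat-style text ("name value" lines)."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in FIELDS:
            stats[parts[0]] = int(parts[1])
    return stats


def sum_over_tasks(cgroup_dir, proc_root="/proc"):
    """Sum the per-task counters for every thread listed in cgroup.threads."""
    totals = dict.fromkeys(FIELDS, 0)
    with open(os.path.join(cgroup_dir, "cgroup.threads")) as f:
        tids = f.read().split()
    for tid in tids:
        try:
            with open(os.path.join(proc_root, tid, "sched")) as f:
                stats = parse_sched(f.read())
        except OSError:
            continue  # the task exited between the two reads
        for name, value in stats.items():
            totals[name] += value
    return totals
```

With these helpers, sum_over_tasks("/sys/fs/cgroup/{GROUP}") needs one file read per thread and only covers tasks that are still alive, while parse_memory_stat() on the contents of /sys/fs/cgroup/{GROUP}/memory.stat is a single read. The two results need not match exactly, since memory.stat also accounts for descendant cgroups.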

Besides, I'm considering evaluating the per-cgroup NUMA balance control[1] to give users fine-grained control per workload. These per-cgroup NUMA balance stats could then be used to evaluate the efficiency of per-cgroup NUMA balancing.

Also, you may want to update Documentation/admin-guide/cgroup-v2.rst
too.

Got it, will do in the next version.

[1] https://lore.kernel.org/lkml/b3f1f6c478127a38b9091a8341374ba160d25c5a.1740483690.git.yu.c.chen@xxxxxxxxx/

thanks,
Chenyu


Thanks,
Michal