[PATCH v2 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask()

From: Waiman Long

Date: Tue Jun 23 2026 - 19:06:37 EST


As reported by sashiko [1], cpuset_update_tasks_nodemask() will do
mpol_rebind_mm() and possibly cpuset_migrate_mm() for all threads of
a multithreaded process. Since commit 3df9ca0a2b8b ("cpuset: migrate
memory only for threadgroup leaders"), cpuset_attach() had been updated
to rebind and migrate memory only for threadgroup leaders to mark the
group leader as the owner of the mm_struct.

To be consistent and avoid unnecessary performance overhead for heavily
multithreaded processes, follow the cpuset_attach() example and perform
memory rebind and migration only for threadgroup leaders.

Also add a paragraph in cgroup-v2.rst under cpuset.mems that the
threadgroup leader is the memory owner of that threadgroup. Therefore
the non-leading threads shouldn't be in other cgroups whose "cpuset.mems"
doesn't fully overlap that of the group leader.

[1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com

Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Reviewed-by: Ridong Chen <ridong.chen@xxxxxxxxx>
---
Documentation/admin-guide/cgroup-v2.rst | 7 +++++++
kernel/cgroup/cpuset.c | 4 ++++
2 files changed, 11 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 993446ab66d0..f9c353174a7e 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2527,6 +2527,13 @@ Cpuset Interface Files
a need to change "cpuset.mems" with active tasks, it shouldn't
be done frequently.

+ For a multithreaded process, the threadgroup leader is
+ considered the owner of the group's memory. Memory policy
+ rebinding and migration will only happen with respect to the
+ threadgroup leader. To avoid unexpected result, non-leading
+ threads shouldn't be put into another cgroup whose "cpuset.mems"
+ doesn't fully overlap that of the threadgroup leader.
+
cpuset.mems.effective
A read-only multiple values file which exists on all
cpuset-enabled cgroups.
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 044ddbf66f8e..055ae54a040a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2673,6 +2673,10 @@ void cpuset_update_tasks_nodemask(struct cpuset *cs)

cpuset_change_task_nodemask(task, &newmems);

+ /* Rebind and migrate mm only for thread group leader */
+ if (!thread_group_leader(task))
+ continue;
+
mm = get_task_mm(task);
if (!mm)
continue;
--
2.54.0