Re: [PATCH 2/2] cgroup/cpuset: Rebind/migrate mm only for threadgroup leader in cpuset_update_tasks_nodemask()
From: Ridong Chen
Date: Mon Jun 22 2026 - 21:25:14 EST
On 6/23/2026 6:45 AM, Waiman Long wrote:
As reported by sashiko [1], cpuset_update_tasks_nodemask() will do
mpol_rebind_mm() and possibly cpuset_migrate_mm() for all threads of
a multithreaded process. Since commit 3df9ca0a2b8b ("cpuset: migrate
memory only for threadgroup leaders"), cpuset_attach() has rebound and
migrated memory only for threadgroup leaders, treating the group leader
as the owner of the mm_struct.
To be consistent and avoid unnecessary performance overhead for heavily
multithreaded processes, follow the cpuset_attach() example and perform
memory rebind and migration only for threadgroup leaders.
Also add a paragraph in cgroup-v2.rst under cpuset.mems stating that the
threadgroup leader is the memory owner of that threadgroup. Therefore
the non-leading threads shouldn't be in other cgroups whose "cpuset.mems"
doesn't fully overlap that of the group leader.
[1] https://sashiko.dev/#/patchset/20260621032816.1806773-1-longman%40redhat.com
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
Documentation/admin-guide/cgroup-v2.rst | 7 +++++++
kernel/cgroup/cpuset.c | 4 ++++
2 files changed, 11 insertions(+)
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 993446ab66d0..341037c7ec9d 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2527,6 +2527,13 @@ Cpuset Interface Files
a need to change "cpuset.mems" with active tasks, it shouldn't
be done frequently.
+ For a multithreaded process, the threadgroup leader is
+ considered the owner of the group's memory. Memory policy
+ rebinding and migration will only happen with respect to the
+ threadgroup leader. To avoid unexpected results, non-leading
+ threads shouldn't be put into another cgroup whose "cpuset.mems"
+ doesn't fully overlap that of the threadgroup leader.
+
cpuset.mems.effective
A read-only multiple values file which exists on all
cpuset-enabled cgroups.
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index bc0207fd6e57..27bc7a466468 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2659,6 +2659,10 @@ void cpuset_update_tasks_nodemask(struct cpuset *cs)
cpuset_change_task_nodemask(task, &newmems);
+ /* Rebind and migrate mm only for task group leader */
+ if (task != task->group_leader)
+ continue;
+
Nit: this could use the thread_group_leader() helper instead:
if (!thread_group_leader(task))
continue;
mm = get_task_mm(task);
if (!mm)
continue;
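For illustration only, here is a rough userspace sketch of the leader-only
filter discussed above. It is not kernel code: struct mock_task and
count_mm_updates() are hypothetical names made up for this example, and the
check is the open-coded pointer comparison from the patch rather than the
kernel's thread_group_leader() helper. It only shows that the per-mm work
would run once per process rather than once per thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's task_struct. */
struct mock_task {
	int tid;
	struct mock_task *group_leader;	/* points to itself for the leader */
};

/*
 * Walk all tasks and count how many times the per-mm work (rebind and
 * possibly migrate) would run when non-leader threads are skipped.
 */
size_t count_mm_updates(const struct mock_task *tasks, size_t n)
{
	size_t updates = 0;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct mock_task *task = &tasks[i];

		/* The per-task nodemask update would still run for every thread. */

		/* Rebind and migrate mm only for the thread group leader. */
		if (task != task->group_leader)
			continue;

		updates++;
	}
	return updates;
}
```

With one process of four threads, this would count a single mm update
instead of four, which is the overhead the patch aims to avoid.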
Reviewed-by: Ridong Chen <ridong.chen@xxxxxxxxx>
--
Best regards
Ridong