Re: [PATCH cgroup/for-next v2 0/5] cgroup/cpuset: Support multiple source/destination cpusets for cpuset_*attach()
From: Ridong Chen
Date: Wed Jun 03 2026 - 06:13:00 EST
On 2026/5/27 4:12, Waiman Long wrote:
>
> On 5/20/26 4:29 AM, Ridong Chen wrote:
>>
>>
>> On 2026/5/16 12:24, Waiman Long wrote:
>>> Sashiko AI review of another cpuset patch had found that cpuset_attach()
>>> and cpuset_can_attach() can be passed a cgroup_taskset with tasks
>>> migrating from one source cpuset to multiple destination cpusets and
>>> vice versa. Further testing of the cpuset code indicates that this is
>>> indeed the case when the v2 cpuset controller is enabled or disabled.
>>>
>>> Unfortunately, cpuset_attach() and cpuset_can_attach() still assume that
>>> there will be one source and one destination cpuset, which may result in
>>> incorrect behavior.
>>>
>>
>> Hi Longman,
>>
>> I am wondering whether we can use the pids subsystem's approach to
>> solve this issue, which I think could be much simpler.
>>
>> For the DL task accounting, we can handle it the same way
>> pids_can_attach() does - just call task_cs(task) for each task
>> individually inside the can_attach() loop and do the nr_deadline_tasks
>> adjustment right there. This eliminates the need to pass per-task
>> source cpuset information to the attach() callback entirely for DL
>> accounting purposes.
> DL task accounting doesn't use the new oldcs stored in the task
> structure which is only used for mm migration. BTW, I believe
> task_cs(task) doesn't return the old cs in cpuset_attach().

Sorry for the late response.

If I understand correctly, DL task accounting needs the destination
cpuset in order to allocate bandwidth, and the destination cpuset can
already be obtained in cpuset_can_attach().

You are right that task_cs(task) does not return the old cpuset in
cpuset_attach(). But do we really need the old cpuset in
cpuset_attach()? Isn't cpuset_attach_old_cs sufficient for mm migration?

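To make the DL accounting idea concrete, here is a rough user-space model
of the pids-style loop, not kernel code. All names (model_cpuset,
model_task, model_can_attach(), model_cancel_attach()) are made up for
illustration; "src" stands in for what task_cs(task) would return while
still inside can_attach(). It only models the nr_deadline_tasks counter
and leaves bandwidth reservation out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct cpuset or task_struct. */
struct model_cpuset {
	int nr_deadline_tasks;
};

struct model_task {
	struct model_cpuset *src;	/* models task_cs(task) in can_attach() */
	struct model_cpuset *dst;	/* destination cpuset for this task */
	bool is_dl;			/* models dl_task(task) */
};

/* Move one DL task's count from 'from' to 'to'. */
static void model_move_dl(struct model_cpuset *from, struct model_cpuset *to)
{
	assert(from->nr_deadline_tasks > 0);
	from->nr_deadline_tasks--;
	to->nr_deadline_tasks++;
}

/* Per-task accounting during the walk; no single-source assumption. */
static void model_can_attach(struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].is_dl && tasks[i].src != tasks[i].dst)
			model_move_dl(tasks[i].src, tasks[i].dst);
}

/* Undo the same walk if the migration is cancelled. */
static void model_cancel_attach(struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].is_dl && tasks[i].src != tasks[i].dst)
			model_move_dl(tasks[i].dst, tasks[i].src);
}
```

Because each task carries its own source and destination in the walk,
nothing has to be handed over to attach() for the counter itself. Whether
the real code can adjust the counter this early is of course the open
question.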
>>
>> For cpuset_migrate_mm(), I don't think we need per-task oldcs storage
>> in task_struct either. The scenarios where multiple source cpusets are
>> involved are:
>>
>> enable cpuset controller: child cpusets inherit parent's
>> effective_mems, so attach_mems_updated is false and
>> cpuset_migrate_mm() is never called.
>>
>> disable cpuset controller: tasks move from children to parent. Since
>> children's effective_mems is always a subset of parent's
>> effective_mems, even if cpuset_migrate_mm() is triggered, it's
>> effectively a noop (no pages need to move from a subset to its superset).
>>
>> cgroup.procs write with threads in different cpusets: this is a
>> many-to-one migration with a single process, so there is only one
>> group_leader and one mm. We only need to record the leader's oldcs,
>> which a single static variable can handle.
>>
>> So in all cases, the migration path only needs one oldcs for the
>> leader. We don't need to add a field to task_struct.
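The subset argument in the "disable cpuset controller" case above can be
checked with a toy bitmask model. This is only a sketch: model_nodemask,
model_mems_subset() and model_nodes_to_vacate() are hypothetical names,
one bit stands for one memory node, and it says nothing about what
cpuset_migrate_mm() actually does internally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per memory node, standing in for nodemask_t. */
typedef uint64_t model_nodemask;

/* True if every node in 'child' is also present in 'parent'. */
static bool model_mems_subset(model_nodemask child, model_nodemask parent)
{
	return (child & ~parent) == 0;
}

/* Nodes that pages would have to leave when moving from 'from' to 'to'. */
static model_nodemask model_nodes_to_vacate(model_nodemask from,
					    model_nodemask to)
{
	return from & ~to;
}
```

With a child on nodes 0-1 (0x3) and a parent on nodes 0-3 (0xf), moving
child to parent leaves an empty set of nodes to vacate, while the reverse
direction would have to vacate nodes 2-3 (0xc).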
>>
>> What do you think?
>
> Yes, that makes sense. I will rework the patch series.
>
> Thanks,
> Longman
>
>>
>>
>>
>>> This patch series is created to fix this issue. The first 2 patches are
>>> just preparatory patches to make the remaining patches easier to review.
>>>
>>> Patch 3 adds a new attach_old_cs field to task_struct to track the
>>> old cpuset, for use when cpuset_migrate_mm() needs to be called in
>>> cpuset_attach().
>>>
>>> Patch 4 moves mpol_rebind_mm() and cpuset_migrate_mm() inside
>>> cpuset_attach_task() to make the CLONE_INTO_CGROUP flag of clone(2)
>>> work more like moving a task from one cpuset to another, while also
>>> making it easier to support multiple source and destination cpusets.
>>>
>>> Patch 5 makes the necessary changes to enable the support of multiple
>>> source and destination cpusets by keeping all the source and destination
>>> cpusets found during task iterations in two singly linked lists for
>>> source and destination cpusets respectively.
>>>
>>> Waiman Long (5):
>>>   cgroup/cpuset: Add a cpuset_reserve_dl_bw() helper
>>>   cgroup/cpuset: Expand the scope of cpuset_can_attach_check()
>>>   cgroup/cpuset: Replace cpuset_attach_old_cs by a new attach_old_cs
>>>     field in task_struct
>>>   cgroup/cpuset: Move mpol_rebind_mm/cpuset_migrate_mm() calls inside
>>>     cpuset_attach_task()
>>>   cgroup/cpuset: Support multiple source/destination cpusets for
>>>     cpuset_*attach()
>>>
>>>  include/linux/sched.h           |   3 +
>>>  kernel/cgroup/cpuset-internal.h |   6 +
>>>  kernel/cgroup/cpuset.c          | 358 +++++++++++++++++++++-----------
>>>  3 files changed, 249 insertions(+), 118 deletions(-)
>>>
>>
>
>
--
Best regards,
Ridong