Re: [PATCH-next v5 6/6] cgroup/cpuset: Support multiple source/destination cpusets for cpuset_*attach()
From: Michal Koutný
Date: Wed Jun 24 2026 - 11:45:35 EST
Hello Waiman.
On Mon, Jun 01, 2026 at 10:32:03PM -0400, Waiman Long <longman@xxxxxxxxxx> wrote:
> This problem is less an issue when enabling the cpuset controller as all
> the newly created child cpusets will have exactly the same set of CPUs
> and memory nodes except when deadline tasks are involved in migration
> as the deadline task accounting data can be off.
>
> It can be more problematic when the cpuset controller is disabled as
> their set of CPUs and memory nodes may differ from their parent or with
> the moving of multi-threaded process from different threaded cgroups.
Generalizing that, it can be an issue for any threaded controller
that somehow relies on the _difference_ between old and new thread
membership.
So I checked some: pids and perf_events look alright (no
diff-dependency), but I noticed that the very same issue is tackled in
sched_change_group/scx_cgroup_move_task and that there is already a
member inside task_struct allocated for this state tracking:
  task_struct::scx::cgrp_moving_from
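
To make the pattern concrete, here is a rough userspace model of per-task "moving from" tracking: record the source before membership changes, use it during the actual move, and clear it afterwards. This is only a sketch. The struct layouts and the model_*() helpers are hypothetical stand-ins, not kernel APIs, and nr_tasks stands in for whatever per-group data depends on the old/new difference.

```c
/*
 * Userspace model only: every name below is a hypothetical stand-in,
 * not a kernel API.
 */
#include <stddef.h>

struct cgroup {
	const char *name;
	int nr_tasks;			/* stand-in for per-group accounting */
};

struct task {
	struct cgroup *cgrp;		/* current membership */
	struct cgroup *cgrp_moving_from; /* set only while a migration is in flight */
};

/* Phase 1: remember each task's source before membership changes. */
static void model_can_attach(struct task *tasks, size_t n, struct cgroup *dst)
{
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].cgrp == dst)
			continue;	/* identity migration, nothing to track */
		tasks[i].cgrp_moving_from = tasks[i].cgrp;
	}
}

/* Phase 2: use the per-task source to update both sides, then clear it. */
static void model_attach(struct task *tasks, size_t n, struct cgroup *dst)
{
	for (size_t i = 0; i < n; i++) {
		struct cgroup *src = tasks[i].cgrp_moving_from;

		if (!src)
			continue;
		src->nr_tasks--;
		dst->nr_tasks++;
		tasks[i].cgrp = dst;
		tasks[i].cgrp_moving_from = NULL;
	}
}

/* Abort path: drop the recorded sources without touching the accounting. */
static void model_cancel_attach(struct task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		tasks[i].cgrp_moving_from = NULL;
}
```

Because the source is stored per task, threads of one process coming from different threaded cgroups each carry their own "from" pointer, so no separate list of source groups is needed in this model.
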
> Fix that by tracking the set of source (old) and destination cpusets
> in singly linked lists and iterating them all to properly update the
> internal data. Also keep the current cs and oldcs variables up-to-date
> with the css and task iterators.
So there would be more than a single use for something conceptually
like:
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 004e6d56a499a..740c02f220c75 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1326,6 +1326,9 @@ struct task_struct {
 #ifdef CONFIG_PREEMPT_RT
 	struct llist_node cg_dead_lnode;
 #endif /* CONFIG_PREEMPT_RT */
+#ifdef CONFIG_CGROUPS_MOVING_FROM
+	struct cgroup *cgrp_moving_from;
+#endif
 #endif /* CONFIG_CGROUPS */
 #ifdef CONFIG_X86_CPU_RESCTRL
 	u32 closid;
diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
index 1a3af2ea2a794..5b63afe83f333 100644
--- a/include/linux/sched/ext.h
+++ b/include/linux/sched/ext.h
@@ -240,9 +240,6 @@ struct sched_ext_entity {
 	bool disallow; /* reject switching into SCX */
 	/* cold fields */
-#ifdef CONFIG_EXT_GROUP_SCHED
-	struct cgroup *cgrp_moving_from;
-#endif
 	struct list_head tasks_node;
 };
diff --git a/init/Kconfig b/init/Kconfig
index 2937c4d308aec..d7e7d4477f862 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1186,6 +1186,7 @@ config EXT_GROUP_SCHED
 	depends on SCHED_CLASS_EXT && CGROUP_SCHED
 	select GROUP_SCHED_WEIGHT
 	select GROUP_SCHED_BANDWIDTH
+	select CGROUPS_MOVING_FROM
 	default y
 endif #CGROUP_SCHED
@@ -1288,6 +1289,7 @@ config CPUSETS
 	depends on SMP
 	select UNION_FIND
 	select CPU_ISOLATION
+	select CGROUPS_MOVING_FROM
 	help
 	  This option will let you create and manage CPUSETs which
 	  allow dynamically partitioning a system into sets of CPUs and
I think this could simplify the before-after state tracking generally,
WDYT?
Michal