Re: [PATCH -next] cpuset: Remove unnecessary checks in rebuild_sched_domains_locked

From: Waiman Long
Date: Tue Nov 25 2025 - 13:16:54 EST


On 11/18/25 3:36 AM, Chen Ridong wrote:
From: Chen Ridong <chenridong@xxxxxxxxxx>

Commit 406100f3da08 ("cpuset: fix race between hotplug work and later CPU
offline") added a check for empty effective_cpus in partitions for cgroup
v2. However, this check did not account for remote partitions, which were
introduced later.

After commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug processing
synchronous"), cgroup v2's cpuset hotplug handling is now synchronous. This
eliminates the race condition with subsequent CPU offline operations that
the original check aimed to fix.
That is true. The original asynchronous cpuset_hotplug_workfn() is called after the hotplug operation finishes. So cpuset can be in a state where cpu_active_mask was updated, but not the effective cpumasks in cpuset.

Instead of extending the check to support remote partitions, this patch
removes the redundant partition effective_cpus check. Additionally, it adds
a check and warningto verify that all generated sched domains consist of
"warningto" => "warning to"
active CPUs, preventing partition_sched_domains from being invoked with
offline CPUs.

Signed-off-by: Chen Ridong <chenridong@xxxxxxxxxx>
---
kernel/cgroup/cpuset.c | 29 ++++++-----------------------
1 file changed, 6 insertions(+), 23 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index daf813386260..1ac58e3f26b4 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1084,11 +1084,10 @@ void dl_rebuild_rd_accounting(void)
  */
 void rebuild_sched_domains_locked(void)
 {
-	struct cgroup_subsys_state *pos_css;
 	struct sched_domain_attr *attr;
 	cpumask_var_t *doms;
-	struct cpuset *cs;
 	int ndoms;
+	int i;
 	lockdep_assert_cpus_held();
 	lockdep_assert_held(&cpuset_mutex);

In fact, the following code and the comments above in rebuild_sched_domains_locked() are also no longer relevant. So you may remove them as well.

        if (!top_cpuset.nr_subparts_cpus &&
            !cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
                return;

@@ -1107,30 +1106,14 @@ void rebuild_sched_domains_locked(void)
 	    !cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
 		return;
-	/*
-	 * With subpartition CPUs, however, the effective CPUs of a partition
-	 * root should be only a subset of the active CPUs. Since a CPU in any
-	 * partition root could be offlined, all must be checked.
-	 */
-	if (!cpumask_empty(subpartitions_cpus)) {
-		rcu_read_lock();
-		cpuset_for_each_descendant_pre(cs, pos_css, &top_cpuset) {
-			if (!is_partition_valid(cs)) {
-				pos_css = css_rightmost_descendant(pos_css);
-				continue;
-			}
-			if (!cpumask_subset(cs->effective_cpus,
-					    cpu_active_mask)) {
-				rcu_read_unlock();
-				return;
-			}
-		}
-		rcu_read_unlock();
-	}
-
-
 	/* Generate domain masks and attrs */
 	ndoms = generate_sched_domains(&doms, &attr);
+	for (i = 0; i < ndoms; ++i) {
+		if (WARN_ON_ONCE(!cpumask_subset(doms[i], cpu_active_mask)))
+			return;
+	}
+

If the purpose of the WARN_ON_ONCE() call is not clear, we should add a comment explaining that cpu_active_mask will not be out of sync with cpuset's effective cpumasks, so the warning should not be triggered.

Cheers,
Longman

 	/* Have scheduler rebuild the domains */
 	partition_sched_domains(ndoms, doms, attr);
 }