Re: [PATCH -next] cpuset: treat root invalid trialcs as exclusive
From: Waiman Long
Date: Mon Nov 17 2025 - 10:56:36 EST
On 11/15/25 4:31 AM, Chen Ridong wrote:
From: Chen Ridong <chenridong@xxxxxxxxxx>
A test scenario revealed inconsistent results based on operation order:
Scenario 1:
#cd /sys/fs/cgroup/
#mkdir A1
#mkdir B1
#echo 1-2 > B1/cpuset.cpus
#echo 0-1 > A1/cpuset.cpus
#echo root > A1/cpuset.cpus.partition
#cat A1/cpuset.cpus.partition
root invalid (Cpu list in cpuset.cpus not exclusive)
Scenario 2:
#cd /sys/fs/cgroup/
#mkdir A1
#mkdir B1
#echo 1-2 > B1/cpuset.cpus
#echo root > A1/cpuset.cpus.partition
#echo 0-1 > A1/cpuset.cpus
#cat A1/cpuset.cpus.partition
root
The second scenario produces an unexpected result: A1 should be marked
as invalid but is incorrectly recognized as valid. This happens because
A1 is still in the root-invalid state when validate_change() is invoked,
so it is checked against its siblings as non-exclusive, even though it
may then automatically transition to a valid partition. The overlap with
B1 is therefore never caught.
To fix this inconsistency, treat trialcs in root-invalid state as exclusive
during validation and set the corresponding exclusive flags, ensuring
consistent behavior regardless of operation order.
Signed-off-by: Chen Ridong <chenridong@xxxxxxxxxx>
---
kernel/cgroup/cpuset.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
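The order dependence described in the commit message can be sketched with
a toy userspace model. This is not the kernel code: struct toy_cpuset,
toy_init_trial() and toy_conflicts() are hypothetical names, and the model
assumes that a cpuset in the root-invalid state has its exclusive flag
cleared and that overlap with a sibling only counts as a conflict when
either side is exclusive. The real validate_change() path is considerably
more involved.

```c
#include <stdbool.h>

#define TOY_FLAG_CPU_EXCLUSIVE (1UL << 0)

/* Hypothetical, heavily simplified stand-in for struct cpuset. */
struct toy_cpuset {
	unsigned long cpus;	/* bit n set means CPU n is requested */
	unsigned long flags;
	int prs;		/* 0: member, > 0: valid root, < 0: invalid root */
};

/*
 * Mirrors the idea of the patch: a trial copy that is not a plain member
 * (valid or invalid root) is checked as if it were exclusive.
 */
static void toy_init_trial(struct toy_cpuset *trial)
{
	if (trial->prs)
		trial->flags |= TOY_FLAG_CPU_EXCLUSIVE;
}

/* Overlapping CPUs only conflict when at least one side is exclusive. */
static bool toy_conflicts(const struct toy_cpuset *trial,
			  const struct toy_cpuset *sibling)
{
	bool exclusive = (trial->flags | sibling->flags) & TOY_FLAG_CPU_EXCLUSIVE;

	return exclusive && (trial->cpus & sibling->cpus);
}
```

With B1 holding CPUs 1-2 and A1 requesting CPUs 0-1 while in the invalid
root state, toy_conflicts() reports no conflict until toy_init_trial() has
marked the trial copy exclusive, which is the gap Scenario 2 exposes.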
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index daf813386260..a189f356b5f1 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2526,6 +2526,18 @@ static void partition_cpus_change(struct cpuset *cs, struct cpuset *trialcs,
}
}
+static int init_trialcs(struct cpuset *cs, struct cpuset *trialcs)
+{
+ trialcs->prs_err = PERR_NONE;
+ /*
+ * If partition_root_state != 0, trialcs may automatically become a valid
+ * partition. Therefore, treat trialcs as exclusive during validation.
+ */
+ if (trialcs->partition_root_state)
+ set_bit(CS_CPU_EXCLUSIVE, &trialcs->flags);
Nit: We usually use the non-atomic version, __set_bit(), if concurrent
access isn't possible, which is true in this case.
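As a rough userspace illustration of the distinction (not kernel code;
demo_set_bit_atomic() and demo_set_bit_plain() are hypothetical stand-ins
for set_bit() and __set_bit(), built on C11 atomics rather than the kernel
bitops):

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the atomic set_bit(). */
static void demo_set_bit_atomic(unsigned int nr, _Atomic unsigned long *flags)
{
	assert(nr < WORD_BITS);
	/* Atomic read-modify-write: safe against concurrent writers. */
	atomic_fetch_or(flags, 1UL << nr);
}

/* Hypothetical stand-in for the non-atomic __set_bit(). */
static void demo_set_bit_plain(unsigned int nr, unsigned long *flags)
{
	assert(nr < WORD_BITS);
	/* Plain load, OR, store: only safe when nobody else touches *flags. */
	*flags |= 1UL << nr;
}
```

Both helpers leave the same value in the flags word; the plain variant
just skips the atomic operation, which is sufficient for a private trial
copy that no other context can reach.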
+ return compute_trialcs_excpus(trialcs, cs);
+}
+
/**
* update_cpumask - update the cpus_allowed mask of a cpuset and all tasks in it
* @cs: the cpuset to consider
@@ -2551,9 +2563,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
if (alloc_tmpmasks(&tmp))
return -ENOMEM;
- compute_trialcs_excpus(trialcs, cs);
- trialcs->prs_err = PERR_NONE;
-
+ init_trialcs(cs, trialcs);
retval = cpus_allowed_validate_change(cs, trialcs, &tmp);
if (retval < 0)
goto out_free;
@@ -2612,7 +2622,7 @@ static int update_exclusive_cpumask(struct cpuset *cs, struct cpuset *trialcs,
* Reject the change if there is exclusive CPUs conflict with
* the siblings.
*/
- if (compute_trialcs_excpus(trialcs, cs))
+ if (init_trialcs(cs, trialcs))
return -EINVAL;
/*
@@ -2628,7 +2638,6 @@ static int update_exclusive_cpumask(struct cpuset *cs, struct cpuset *trialcs,
if (alloc_tmpmasks(&tmp))
return -ENOMEM;
- trialcs->prs_err = PERR_NONE;
partition_cpus_change(cs, trialcs, &tmp);
spin_lock_irq(&callback_lock);
Acked-by: Waiman Long <longman@xxxxxxxxxx>