Re: [PATCH v7 3/9] cgroup/cpuset: Prevent race between task attach and cpuset state change
From: Ridong Chen
Date: Sun Jun 21 2026 - 22:21:34 EST
On 6/21/2026 11:28 AM, Waiman Long wrote:
Commit e44193d39e8d ("cpuset: let hotplug propagation work wait for
task attaching") was introduced to make hotplug operations wait
until the task attach operation completes. However, it is
still possible for the state of the source or destination cpuset
to change between the cpuset_can_attach() call and the subsequent
cpuset_attach()/cpuset_cancel_attach() call.
As a result, data gathered during cpuset_can_attach() cannot be reliably
used in the subsequent cpuset_attach()/cpuset_cancel_attach()
call at all. Make the task attach operation more robust
and allow the sharing of data between cpuset_can_attach() and
cpuset_attach()/cpuset_cancel_attach() by making cpuset_write_resmask()
and cpuset_partition_write() wait for the completion of task attach
and by setting the attach_in_progress flag in the source cpuset as well.
The comments about validate_change() are no longer valid as it won't
be called at all if an attach operation is in progress. So the comments
can be removed.
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
---
kernel/cgroup/cpuset.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a1c8890d3519..65d095dcada1 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3080,11 +3080,8 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
cs->dl_bw_cpu = cpu;
out_success:
- /*
- * Mark attach is in progress. This makes validate_change() fail
- * changes which zero cpus/mems_allowed.
- */
cs->attach_in_progress++;
+ oldcs->attach_in_progress++;
I only see oldcs->attach_in_progress being incremented here; the matching decrement seems to land in a later patch. Taken on its own, this patch is therefore unbalanced: the count on the source cpuset would leak, and a later write to the source cpuset would block on the new wait_event(). That means the series is not bisect-safe at this point.
Let's either keep this patch self-contained or fold the increment into the patch that adds the decrement.
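To make the imbalance concrete, here is a small user-space model. It is only a sketch, not kernel code: struct model_cpuset and all model_* helpers are hypothetical names made up for illustration, and only the attach_in_progress counter and the "== 0" wait condition are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct cpuset; only the counter is modelled. */
struct model_cpuset {
	int attach_in_progress;
};

/* Models the out_success path of cpuset_can_attach() with this patch applied. */
static void model_can_attach(struct model_cpuset *oldcs,
			     struct model_cpuset *cs)
{
	cs->attach_in_progress++;
	oldcs->attach_in_progress++;
}

/*
 * Models the end of attach/cancel_attach. dec_src selects whether the
 * source count is dropped too (assumption: a later patch does this).
 */
static void model_attach_done(struct model_cpuset *oldcs,
			      struct model_cpuset *cs, bool dec_src)
{
	cs->attach_in_progress--;
	if (dec_src)
		oldcs->attach_in_progress--;
}

/* Mirrors the wait_event() condition used by the write paths. */
static bool model_writer_may_proceed(const struct model_cpuset *cs)
{
	return cs->attach_in_progress == 0;
}

/* True if a write to the source cpuset can proceed after one attach cycle. */
static bool model_source_writable_after_attach(bool dec_src)
{
	struct model_cpuset src = { 0 }, dst = { 0 };

	model_can_attach(&src, &dst);
	model_attach_done(&src, &dst, dec_src);
	return model_writer_may_proceed(&src);
}
```

With dec_src false, which is how I read this patch in isolation, the source count stays at 1 after the attach cycle and the writer's wait condition never becomes true. With dec_src true the count returns to 0 and the writer can proceed.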
out_unlock:
if (ret)
@@ -3235,10 +3232,19 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
return -EACCES;
buf = strstrip(buf);
+retry:
+ wait_event(cpuset_attach_wq, cs->attach_in_progress == 0);
+
cpuset_full_lock();
if (!is_cpuset_online(cs))
goto out_unlock;
+ /* Don't race with task attach */
+ if (cs->attach_in_progress) {
+ cpuset_full_unlock();
+ goto retry;
+ }
+
trialcs = dup_or_alloc_cpuset(cs);
if (!trialcs) {
retval = -ENOMEM;
@@ -3366,7 +3372,17 @@ static ssize_t cpuset_partition_write(struct kernfs_open_file *of, char *buf,
else
return -EINVAL;
+retry:
+ wait_event(cpuset_attach_wq, cs->attach_in_progress == 0);
+
cpuset_full_lock();
+
+ /* Don't race with task attach */
+ if (cs->attach_in_progress) {
+ cpuset_full_unlock();
+ goto retry;
+ }
+
if (is_cpuset_online(cs))
retval = update_prstate(cs, val);
cpuset_update_sd_hk_unlock();
@@ -3605,10 +3621,6 @@ static int cpuset_can_fork(struct task_struct *task, struct css_set *cset)
if (ret)
goto out_unlock;
- /*
- * Mark attach is in progress. This makes validate_change() fail
- * changes which zero cpus/mems_allowed.
- */
cs->attach_in_progress++;
out_unlock:
mutex_unlock(&cpuset_mutex);
--
Best regards
Ridong