Re: [PATCH/for-next v4 3/4] cgroup/cpuset: Call housekeeping_update() without holding cpus_read_lock
From: Waiman Long
Date: Tue Feb 10 2026 - 09:03:50 EST
On 2/9/26 8:29 PM, Chen Ridong wrote:
> On 2026/2/10 4:29, Waiman Long wrote:
>> On 2/9/26 2:12 AM, Chen Ridong wrote:
>>>                  return;
>>>          }
>>> -        WARN_ON_ONCE(housekeeping_update(isolated_cpus) < 0);
>>> -        isolated_cpus_updating = false;
>>> +        /*
>>> +         * update_isolation_cpumasks() may be called more than once in the
>>> +         * same cpuset_mutex critical section.
>>> +         */
>>> +        lockdep_assert_held(&cpuset_top_mutex);
>>> +        if (isolcpus_twork_queued)
>>> +                return;
>>> +
>>> +        init_task_work(&twork_cb, isolcpus_tworkfn);
>>> +        if (!task_work_add(current, &twork_cb, TWA_RESUME))
>>> +                isolcpus_twork_queued = true;
>>> +        else
>>> +                WARN_ON_ONCE(1); /* Current task shouldn't be exiting */
>>>  }
>>>
>>> Timeline:
>>>
>>> user A                             user B
>>> write isolated cpus                write isolated cpus
>>> isolated_cpus_update
>>> update_isolation_cpumasks
>>>   task_work_add
>>>   isolcpus_twork_queued = true
>>> // before returning userspace
>>> // waiting for worker
>>>                                    isolated_cpus_update
>>>                                    if (isolcpus_twork_queued)
>>>                                        return // Early exit
>>>                                    // return to userspace
>>> // workqueue finishes
>>> // return to userspace
>>>
>>> For User B, the isolated_cpus value appears to be set and the syscall returns
>>> successfully to userspace. However, because isolcpus_twork_queued was already
>>> true (set by User A), User B's call skipped the actual mask update
>>> (update_isolation_cpumasks).
>>>
>>> Thus, the new isolated_cpus value is not yet effective in the kernel, even
>>> though User B's write operation returned without error.
>>>
>>> Is this a valid issue? Should User B's write be blocked?
>>
>> It is perfectly possible that isolated_cpus can be modified more than one time
>> from different tasks before a work or task_work function is executed. When that
>> function is invoked, isolated_cpus should contain changes for both. It will copy
>> isolated_cpus to isolated_hk_cpus and pass it to housekeeping_update(). When the
>> 2nd work or task_work function is invoked, it will see that isolated_cpus match
>> isolated_hk_cpus and skip the housekeeping_update() action. There is no need to
>> block user B's write as only one task can update isolated_cpus at any time.
>
> It is clear about isolated_hk_cpus and isolated_cpus.
>
> The main question remains: user B receives a success return even though
> isolated_hk_cpus has not yet taken effect (i.e.,
> /sys/devices/system/cpu/isolated does not reflect the change). In that case, how
> can user B confirm whether their configuration is actually applied?

The task_work function is synchronous. In other words, if a user writes to a
cpuset control file to modify an isolated partition, the task_work function,
if queued, is guaranteed to have been executed by the time control is passed
back to userspace.

The wq work function, OTOH, is asynchronous. So if a user brings down an
isolated CPU to make an isolated partition invalid, the resulting changes to
the sched domains may not be completed by the time the offline operation
returns. However, this is an operation that normal users shouldn't do on a
production system anyway, and they are taking their own risk if they try to
do it.

Cheers,
Longman