Re: [PATCH] workqueue: Fix pool->nr_running type back to atomic
From: Tejun Heo
Date: Tue Feb 06 2024 - 11:52:51 EST
Hello,
On Tue, Feb 06, 2024 at 04:00:24PM +0800, Yunlong Xing wrote:
> In a CPU-hotplug test, when a core is plugged in, set_cpus_allowed_ptr() may
> fail to restore the cpus_mask of a per-cpu worker, so the worker's cpus_mask
> remains wq_unbound_cpumask until the core is hotplugged again. As a result,
> workers in the same per-cpu pool can run concurrently and modify nr_running
> at the same time, which causes an atomicity problem.
How would set_cpus_allowed_ptr() fail? That should trigger WARN_ON, right?
If set_cpus_allowed_ptr() fails, nr_running getting desynchronized is only
part of the problem. We would also end up running per-cpu work items, which
must execute on the pool's own CPU, on foreign CPUs.
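
To illustrate the symptom being reported, here is a rough userspace sketch
(not kernel code) of what happens when several workers update one counter
without an atomic read-modify-write. Everything in it is hypothetical and
exists only for illustration: the pthreads stand in for workers that ended up
on different CPUs, and NUM_WORKERS, ITERATIONS, racy_worker(),
atomic_worker() and run_workers() are made-up names. The racy path is written
as a separate relaxed load and store so that the lost update is visible
without relying on undefined behavior.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NUM_WORKERS 4        /* hypothetical: workers sharing one pool */
#define ITERATIONS  100000   /* hypothetical: updates per worker */

/* Stand-in for pool->nr_running; not the kernel's data structure. */
static atomic_int nr_running;

/* Load and store as two separate steps: another thread can slip in
 * between them, so increments may be lost. */
static void *racy_worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		int v = atomic_load_explicit(&nr_running, memory_order_relaxed);
		atomic_store_explicit(&nr_running, v + 1, memory_order_relaxed);
	}
	return NULL;
}

/* A single atomic read-modify-write per update: no increment is lost. */
static void *atomic_worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++)
		atomic_fetch_add(&nr_running, 1);
	return NULL;
}

/* Run all workers concurrently and return the final counter value. */
static int run_workers(int use_atomic)
{
	pthread_t tid[NUM_WORKERS];

	atomic_store(&nr_running, 0);
	for (int i = 0; i < NUM_WORKERS; i++)
		pthread_create(&tid[i], NULL,
			       use_atomic ? atomic_worker : racy_worker, NULL);
	for (int i = 0; i < NUM_WORKERS; i++)
		pthread_join(tid[i], NULL);
	return atomic_load(&nr_running);
}
```

Calling run_workers(1) from a main() always yields NUM_WORKERS * ITERATIONS,
while run_workers(0) can come up short. Note that this only covers the
counter: making it atomic would not put the work items back on the CPU they
are supposed to run on.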
Thanks.
--
tejun