Re: [PATCH] workqueue: Ensure that cpumask set for pools created after boot

From: Michael Bringmann
Date: Mon May 15 2017 - 11:48:31 EST


Hello:

On 05/10/2017 12:33 PM, Tejun Heo wrote:
> Hello,
>
> On Wed, May 10, 2017 at 11:48:17AM -0500, Michael Bringmann wrote:
>>
>> On NUMA systems with dynamic processors, the content of the cpumask
>> may change over time. As new processors are added via DLPAR operations,
>> workqueues are created for them. This patch ensures that the pools
>> created for new workqueues will be initialized with a cpumask before
>> the first worker is created, attached, and woken up. If the mask is
>> not set up, then the kernel will crash when 'wake_up_process' is unable
>> to find a valid CPU to which to assign the new worker.
>>
>> Signed-off-by: Michael Bringmann <mwb@xxxxxxxxxxxxxxxxxx>
>> ---
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index c74bf39..6091069 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -3366,6 +3366,8 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
>> copy_workqueue_attrs(pool->attrs, attrs);
>> pool->node = target_node;
>>
>> + cpumask_copy(pool->attrs->cpumask, cpumask_of(smp_processor_id()));
>
> What prevents a cpu getting added right here tho?

PowerPC has only one control path for adding/removing CPUs via DLPAR
operations, and the underlying code is also protected by multiple locks.

>
> Maybe the right thing to do is protecting the whole thing with hotplug
> readlock?

The operation is already within a hotplug readlock when performing DLPAR
add/remove. Adding a CPU to the system requires it to be brought online,
and removing a CPU from the system requires it to be taken offline. Both
involve calls to cpu_up / cpu_down, which go through _cpu_up / _cpu_down,
and those acquire the hotplug locks, among others, along the path of
execution.

The locks are acquired before execution reaches the workqueue code, the
pool creation/attachment code (which is where the cpumask needs to be set),
or the attempt to wake up the initially created task in 'sched.c'.
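
To make the ordering concrete, here is a rough userspace sketch of the
sequence described above: take the hotplug lock, create the pool and set
its mask, then wake the first worker. It is only a model under my own
assumptions. Every model_* name is hypothetical and is not a kernel
symbol, a pthread mutex stands in for the hotplug locking, a 64-bit word
stands in for the cpumask, and a return value of -1 stands in for the
crash seen when no valid CPU can be found.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static pthread_mutex_t model_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static bool model_hotplug_lock_held;

struct model_pool {
    uint64_t cpumask;   /* bit n set means CPU n is allowed */
};

/* Stand-in for waking the first worker: needs at least one allowed CPU. */
static int model_wake_first_worker(const struct model_pool *pool)
{
    assert(model_hotplug_lock_held);    /* lock is taken by the caller */
    for (int cpu = 0; cpu < 64; cpu++)
        if (pool->cpumask & (UINT64_C(1) << cpu))
            return cpu;                 /* lowest allowed CPU */
    return -1;                          /* empty mask: the failure case */
}

/* Stand-in for pool creation; set_mask toggles the step the patch adds. */
static int model_create_pool(struct model_pool *pool, int current_cpu,
                             bool set_mask)
{
    assert(model_hotplug_lock_held);
    pool->cpumask = 0;
    if (set_mask && current_cpu >= 0 && current_cpu < 64)
        pool->cpumask = UINT64_C(1) << current_cpu;
    return model_wake_first_worker(pool);
}

/* Stand-in for the DLPAR add path: lock first, then pool setup. */
static int model_dlpar_add(int current_cpu, bool set_mask)
{
    struct model_pool pool;
    int ret;

    pthread_mutex_lock(&model_hotplug_lock);
    model_hotplug_lock_held = true;
    ret = model_create_pool(&pool, current_cpu, set_mask);
    model_hotplug_lock_held = false;
    pthread_mutex_unlock(&model_hotplug_lock);
    return ret;
}
```

In this model, model_dlpar_add(3, true) returns 3, while
model_dlpar_add(3, false) returns -1 because the mask is still empty when
the first worker is woken.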

>
> Thanks.
>

Regards,
Michael

--
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
mwb@xxxxxxxxxxxxxxxxxx