Re: [PATCH v3 1/9] smp: Run functions concurrently in smp_call_function_many()

From: Nadav Amit
Date: Mon Jul 22 2019 - 15:02:41 EST


> On Jul 22, 2019, at 11:51 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Mon, 22 Jul 2019, Nadav Amit wrote:
>>> On Jul 22, 2019, at 11:37 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>>>
>>> On Mon, 22 Jul 2019, Peter Zijlstra wrote:
>>>
>>>> On Thu, Jul 18, 2019 at 05:58:29PM -0700, Nadav Amit wrote:
>>>>> +/*
>>>>> + * Call a function on all processors. May be used during early boot while
>>>>> + * early_boot_irqs_disabled is set.
>>>>> + */
>>>>> +static inline void on_each_cpu(smp_call_func_t func, void *info, int wait)
>>>>> +{
>>>>> + on_each_cpu_mask(cpu_online_mask, func, info, wait);
>>>>> +}
>>>>
>>>> I'm thinking that one if buggy, nothing protects online mask here.
>>>
>>> The current implementation has preemption disabled before touching
>>> cpu_online_mask which at least protects against a CPU going away as that
>>> prevents the stomp machine thread from getting on the CPU. But it's not
>>> protected against a CPU coming online concurrently.
>>
>> I still donât understand. If you called cpu_online_mask() and did not
>> disable preemption before calling it, you are already (today) not protected
>> against another CPU coming online. Disabling preemption in on_each_cpu()
>> will not solve it.
>
> Disabling preemption _cannot_ protect against a CPU coming online. It only
> can protect against a CPU being offlined.
>
> The current implementation of on_each_cpu() disables preemption _before_
> touching cpu_online_mask.
>
> void on_each_cpu(void (*func) (void *info), void *info, int wait)
> {
> unsigned long flags;
>
> preempt_disable();
> smp_call_function(func, info, wait);
>
> smp_call_function() has another preempt_disable as it can be called
> separately and it does:
>
> preempt_disable();
> smp_call_function_many(cpu_online_mask, func, info, wait);
>
> Your new on_each_cpu() implementation does not. So there is a
> difference. Whether it matters or not is a different question, but that
> needs to be explained and documented.

Thanks for explaining - so your concern is for CPUs being offlined.

But unless I am missing something: on_each_cpu() calls __on_each_cpu_mask(),
which disables preemption and calls __smp_call_function_many().

Then __smp_call_function_many() runs:

cpumask_and(cfd->cpumask, mask, cpu_online_mask);

â before choosing which remote CPUs should run the function. So the only
case that I was missing is if the current CPU goes away and the function is
called locally.

Can it happen? I can add documentation and a debug assertion for this case.