Re: [PATCH] cpu/hotplug: Serialize callback invocations proper

From: Bart Van Assche
Date: Tue Mar 14 2017 - 13:38:28 EST


On Tue, 2017-03-14 at 16:06 +0100, Sebastian Andrzej Siewior wrote:
> The setup/remove_state/instance() functions in the hotplug core code are
> serialized against concurrent CPU hotplug, but unfortunately not serialized
> against themselves.
>
> As a consequence, a concurrent invocation of these functions results in
> corruption of the callback machinery because two instances try to invoke
> callbacks on remote cpus at the same time. This results in missing callback
> invocations and initiator threads waiting forever on the completion.
>
> The obvious solution of replacing get_online_cpus() with cpu_hotplug_begin()
> is not possible because at least one callsite calls into these functions
> from a get_online_cpus() locked region.
>
> Extend the protection scope of the cpuhp_state_mutex from solely protecting
> the state arrays to cover the callback invocation machinery as well.
>
> Reported-by: Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx>
> Fixes: 5b7aa87e0482 ("cpu/hotplug: Implement setup/removal interface")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
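
For readers following along, here is a rough userspace sketch of the idea in
the changelog above: one mutex covers both the state table update and the
callback invocations, so two concurrent setup calls cannot interleave. This is
an analogy using pthreads, not the kernel code. The names state_mutex,
state_table, setup_state(), count_cb(), NR_STATES and nr_cpus are all
hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_STATES 8                     /* hypothetical table size */

typedef int (*state_cb_t)(int cpu);

/* One lock for the table AND for running the callbacks (the point of the fix). */
static pthread_mutex_t state_mutex = PTHREAD_MUTEX_INITIALIZER;
static state_cb_t state_table[NR_STATES];
static int nr_cpus = 4;                 /* hypothetical number of "cpus" */
static int calls;                       /* how often count_cb() has run */

/* Example callback: only counts its invocations. */
static int count_cb(int cpu)
{
	assert(cpu >= 0 && cpu < nr_cpus);
	calls++;
	return 0;
}

/*
 * Install cb for a state and invoke it once per "cpu".
 * The mutex is held across both steps, so a second caller cannot start
 * invoking callbacks while the first one is still in progress.
 * Returns 0 on success, -1 on a bad argument or an already used state,
 * or the first non-zero callback return value.
 */
static int setup_state(int state, state_cb_t cb)
{
	int cpu, ret = 0;

	if (state < 0 || state >= NR_STATES || !cb)
		return -1;

	pthread_mutex_lock(&state_mutex);
	if (state_table[state]) {
		ret = -1;               /* state already installed */
		goto out;
	}
	state_table[state] = cb;
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		ret = cb(cpu);
		if (ret) {
			state_table[state] = NULL;  /* undo on failure */
			break;
		}
	}
out:
	pthread_mutex_unlock(&state_mutex);
	return ret;
}
```

Calling setup_state(0, count_cb) from several threads would run the callback
loop for at most one caller at a time. The sketch does not model remote
invocation or completions, and it leaves out rollback of callbacks that
already ran.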

Tested-by: Bart Van Assche <Bart.VanAssche@xxxxxxxxxxx>

So this regression was introduced in kernel v4.6? Anyway, thanks for the patch!

Bart.