Re: mm: deadlock between get_online_cpus/pcpu_alloc

From: Mel Gorman
Date: Tue Feb 07 2017 - 06:14:08 EST


On Tue, Feb 07, 2017 at 10:42:49AM +0000, Mel Gorman wrote:
> On Tue, Feb 07, 2017 at 10:23:31AM +0100, Vlastimil Babka wrote:
> > > cpu offlining. I have to check the code but my impression was that WQ
> > > code will ignore the cpu requested by the work item when the cpu is
> > > going offline. If the offline happens while the worker function already
> > > executes then it has to wait as we run with preemption disabled so we
> > > should be safe here. Or am I missing something obvious?
> >
> > Tejun suggested an alternative solution to avoiding get_online_cpus() in
> > this thread:
> > https://lkml.kernel.org/r/<20170123170329.GA7820@xxxxxxxxxxxxxxx>
>
> But it would look like the following as it could be serialised against
> pcpu_drain_mutex as the cpu hotplug teardown callback is allowed to sleep.
>

Bah, this is obviously unsafe. It's guaranteed to deadlock.

--
Mel Gorman
SUSE Labs