Re: mm: deadlock between get_online_cpus/pcpu_alloc

From: Mel Gorman
Date: Tue Feb 07 2017 - 05:28:17 EST


On Tue, Feb 07, 2017 at 10:49:28AM +0100, Vlastimil Babka wrote:
> On 02/07/2017 10:43 AM, Mel Gorman wrote:
> > If I'm reading this right, a hot-remove will mark the pool
> > POOL_DISASSOCIATED and unbound. A work item queued for draining gets
> > migrated during hot-remove, so a drain operation will execute twice on
> > a CPU -- once for what was queued there and a second time for the CPU
> > it was migrated from. It should still work with flush_work, which doesn't
> > appear to block forever if an item got migrated to another workqueue.
> > The actual drain work function uses the ID of the CPU it's currently
> > running on, so it shouldn't get confused.
>
> Is the worker that will process this migrated workqueue also guaranteed
> to be pinned to a cpu for the whole work, though? drain_local_pages()
> needs that guarantee.
>

It should be, because it runs in a workqueue handler bound to that CPU
(the work is queued on wq->cpu_pwqs in __queue_work).
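
As a rough userspace analogy only (this is not the kernel workqueue code),
the sketch below restricts the calling thread to a single CPU, much as a
bound worker is, and checks that the CPU ID seen at the start and end of
the handler is the same. The names first_allowed_cpu, local_work and
run_bound are made up for the illustration; it assumes Linux with glibc,
and the only real interfaces used are sched_getaffinity, sched_setaffinity
and sched_getcpu.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Return the first CPU the calling thread may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
	cpu_set_t allowed;

	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &allowed))
			return cpu;
	return -1;
}

/*
 * Hypothetical stand-in for a handler that acts on "the CPU I am running
 * on". Returns that CPU ID if it stayed the same across the work, else -1.
 */
static int local_work(void)
{
	int before = sched_getcpu();
	volatile unsigned long spin = 0;

	for (unsigned long i = 0; i < 1000000UL; i++)
		spin += i;	/* busy work, giving the scheduler a chance to move us */

	return sched_getcpu() == before ? before : -1;
}

/*
 * Run local_work() with the calling thread restricted to one CPU, then
 * restore the previous affinity mask. Returns the CPU the work observed,
 * or -1 on failure.
 */
static int run_bound(int cpu)
{
	cpu_set_t saved, one;
	int seen;

	if (sched_getaffinity(0, sizeof(saved), &saved) != 0)
		return -1;
	CPU_ZERO(&one);
	CPU_SET(cpu, &one);
	if (sched_setaffinity(0, sizeof(one), &one) != 0)
		return -1;

	seen = local_work();

	sched_setaffinity(0, sizeof(saved), &saved);
	return seen;
}
```

Calling run_bound(first_allowed_cpu()) should return the CPU that was
passed in. Without the affinity restriction, local_work() carries no such
guarantee, which is the concern raised for drain_local_pages().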

--
Mel Gorman
SUSE Labs