Re: mm: deadlock between get_online_cpus/pcpu_alloc
From: Thomas Gleixner
Date: Tue Feb 07 2017 - 17:34:12 EST
On Mon, 6 Feb 2017, Dmitry Vyukov wrote:
> On Mon, Jan 30, 2017 at 4:48 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> Unfortunately it does not seem to help.
> Fuzzer now runs on 510948533b059f4f5033464f9f4a0c32d4ab0c08 of
> mmotm/auto-latest
> (git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git):
>
> commit 510948533b059f4f5033464f9f4a0c32d4ab0c08
> Date: Thu Feb 2 10:08:47 2017 +0100
> mmotm: userfaultfd-non-cooperative-add-event-for-memory-unmaps-fix
>
> The commit you referenced is already there:
>
> commit 806b158031ca0b4714e775898396529a758ebc2c
> Date: Thu Feb 2 08:53:16 2017 +0100
> mm, page_alloc: use static global work_struct for draining per-cpu pages
<SNIP>
> Chain exists of:
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(pcpu_alloc_mutex);
>                                lock(cpu_hotplug.lock);
>                                lock(pcpu_alloc_mutex);
>   lock(cpu_hotplug.dep_map);
And that's exactly what happens:
                                cpu_up()
  alloc_percpu()                  lock(hotplug.lock)
    lock(&pcpu_alloc_mutex)
      ..                          alloc_percpu()
    drain_all_pages()               lock(&pcpu_alloc_mutex)
      get_online_cpus()
        lock(hotplug.lock)
Classic deadlock, i.e. you _cannot_ call get_online_cpus() while holding
pcpu_alloc_mutex.
Alternatively you can forbid per cpu alloc/free while holding
hotplug.lock. I doubt that this will make people happy :)
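
To make the inversion concrete, here is a rough user-space sketch of the kind of order tracking that flags it. This is not the kernel's lockdep code, only an illustration: the tracker struct, the helper names and the two simulate_*() functions are made up for this example, and the two lock ids merely stand in for pcpu_alloc_mutex and hotplug.lock. It records which lock was held when another was taken and complains when the reverse order shows up later.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical ids standing in for the two locks in the report. */
enum { PCPU_ALLOC_MUTEX, HOTPLUG_LOCK, NR_LOCKS };

/* Illustrative tracker, not a kernel structure. */
struct order_tracker {
	bool held[NR_LOCKS];             /* locks held by the current path */
	bool before[NR_LOCKS][NR_LOCKS]; /* before[a][b]: a was held when b was taken */
};

static void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/* Returns false if taking 'lock' inverts an order recorded earlier. */
static bool tracker_acquire(struct order_tracker *t, int lock)
{
	bool ok = true;

	assert(lock >= 0 && lock < NR_LOCKS);
	for (int h = 0; h < NR_LOCKS; h++) {
		if (!t->held[h] || h == lock)
			continue;
		if (t->before[lock][h])
			ok = false;              /* seen lock -> h before, now h -> lock */
		t->before[h][lock] = true;
	}
	t->held[lock] = true;
	return ok;
}

static void tracker_release(struct order_tracker *t, int lock)
{
	t->held[lock] = false;
}

/* Left column: alloc_percpu() -> drain_all_pages() -> get_online_cpus() */
static bool simulate_alloc_path(struct order_tracker *t)
{
	bool ok = tracker_acquire(t, PCPU_ALLOC_MUTEX);

	ok = tracker_acquire(t, HOTPLUG_LOCK) && ok;
	tracker_release(t, HOTPLUG_LOCK);
	tracker_release(t, PCPU_ALLOC_MUTEX);
	return ok;
}

/* Right column: cpu_up() -> alloc_percpu() */
static bool simulate_hotplug_path(struct order_tracker *t)
{
	bool ok = tracker_acquire(t, HOTPLUG_LOCK);

	ok = tracker_acquire(t, PCPU_ALLOC_MUTEX) && ok;
	tracker_release(t, PCPU_ALLOC_MUTEX);
	tracker_release(t, HOTPLUG_LOCK);
	return ok;
}
```

Running the alloc path first records pcpu_alloc_mutex -> hotplug.lock, so the hotplug path is rejected as soon as it takes the locks the other way round. Either order alone is fine; it is the combination of both that cannot work.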
Thanks,
tglx