Re: [PATCH -mm] cpuhotplug: introduce try_get_online_cpus() take 3

From: Paul E. McKenney
Date: Tue Jun 09 2009 - 19:48:13 EST


On Tue, Jun 09, 2009 at 12:34:38PM -0700, Andrew Morton wrote:
> On Tue, 09 Jun 2009 20:07:09 +0800
> Lai Jiangshan <laijs@xxxxxxxxxxxxxx> wrote:
>
> > get_online_cpus() is a typical coarse-grained lock.
> > It is a source of ABBA deadlocks.
> >
> > Because of the CPU notifiers, some subsystems' global locks must
> > be acquired after cpu_hotplug.lock. A subsystem's global lock
> > is a coarse-grained lock too, so many locks in the kernel
> > must be acquired after cpu_hotplug.lock (if we need
> > cpu_hotplug.lock held too).
> >
> > Otherwise it may lead to an ABBA deadlock like this:
> >
> > thread 1                                    | thread 2
> > _cpu_down()                                 | Lock a-kernel-lock.
> > cpu_hotplug_begin()                         |
> > down_write(&cpu_hotplug.lock)               |
> > __raw_notifier_call_chain(CPU_DOWN_PREPARE) | get_online_cpus()
> > ------------------------------------------------------------------------
> > Lock a-kernel-lock. (wait thread 2)         | down_read(&cpu_hotplug.lock)
> >                                             | (wait thread 1)
>
> Confused. cpu_hotplug_begin() doesn't do
> down_write(&cpu_hotplug.lock). If it _were_ to do that then yes, we'd
> be vulnerable to the above deadlock.

The current implementation is a bit more complex. If you hold a kernel
mutex across get_online_cpus() and also acquire that same kernel mutex
in a hotplug notifier that permits sleeping, I believe that you really
can get a deadlock as follows:

Task 1                              | Task 2
                                    | mutex_lock(&mylock);
cpu_hotplug_begin()                 |
mutex_lock(&cpu_hotplug.lock);      |
[assume cpu_hotplug.refcount == 0]  | get_online_cpus()
---------------------------------------------------------------------------
mutex_lock(&mylock);                | mutex_lock(&cpu_hotplug.lock);
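
For anyone who wants to poke at this interleaving outside the kernel, here
is a minimal userspace sketch using pthreads. It is only a model, not
kernel code: hotplug_lock and mylock are stand-ins for cpu_hotplug.lock and
the subsystem mutex, and run_task() and abba_would_deadlock() are names made
up for illustration. The second acquisition uses pthread_mutex_trylock()
instead of a blocking lock so that the program reports the conflict rather
than hanging, and two barriers force the interleaving shown in the diagram.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Userspace stand-ins (assumptions for illustration, not kernel objects). */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* models cpu_hotplug.lock */
static pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;       /* models the subsystem mutex */

/* Barriers that pin down the interleaving from the diagram. */
static pthread_barrier_t both_hold_first;   /* each task holds its first lock */
static pthread_barrier_t both_tried_second; /* each task has tried its second lock */

struct task {
	pthread_mutex_t *first;  /* lock taken above the dashed line */
	pthread_mutex_t *second; /* lock wanted below the dashed line */
	int second_rc;           /* result of the trylock on 'second' */
};

static void *run_task(void *arg)
{
	struct task *t = arg;

	pthread_mutex_lock(t->first);
	pthread_barrier_wait(&both_hold_first);

	/* A blocking lock here would hang; trylock just records the outcome. */
	t->second_rc = pthread_mutex_trylock(t->second);
	pthread_barrier_wait(&both_tried_second);

	if (t->second_rc == 0)
		pthread_mutex_unlock(t->second);
	pthread_mutex_unlock(t->first);
	return NULL;
}

/*
 * Returns 1 if both tasks found their second lock busy (so blocking
 * acquisitions would have deadlocked), 0 if not, -1 on setup failure.
 */
int abba_would_deadlock(void)
{
	struct task hotplug_writer = { &hotplug_lock, &mylock, 0 }; /* Task 1 */
	struct task cpu_reader = { &mylock, &hotplug_lock, 0 };     /* Task 2 */
	pthread_t t1;

	if (pthread_barrier_init(&both_hold_first, NULL, 2) != 0)
		return -1;
	if (pthread_barrier_init(&both_tried_second, NULL, 2) != 0) {
		pthread_barrier_destroy(&both_hold_first);
		return -1;
	}
	if (pthread_create(&t1, NULL, run_task, &hotplug_writer) != 0) {
		pthread_barrier_destroy(&both_tried_second);
		pthread_barrier_destroy(&both_hold_first);
		return -1;
	}

	run_task(&cpu_reader); /* Task 2 runs in the calling thread */
	pthread_join(t1, NULL);

	pthread_barrier_destroy(&both_tried_second);
	pthread_barrier_destroy(&both_hold_first);

	return hotplug_writer.second_rc == EBUSY &&
	       cpu_reader.second_rc == EBUSY;
}
```

Swapping the order in one of the two struct task initializers, so that both
tasks take the locks in the same order, would make the first barrier itself
the point where one task waits for the other, which is why the sketch only
models the inverted ordering.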


That said, when I look at the raw_notifier_call_chain() and
unregister_cpu_notifier() code paths, it is not obvious to me that they
exclude each other or otherwise protect the cpu_chain list...

Thanx, Paul