Re: [PATCH v2 tip/core/rcu 01/22] smpboot: Add common code for notification from dying CPU
From: Paul E. McKenney
Date: Tue Mar 17 2015 - 12:35:17 EST
On Tue, Mar 17, 2015 at 03:08:46PM +0100, Peter Zijlstra wrote:
> On Tue, Mar 17, 2015 at 04:36:48AM -0700, Paul E. McKenney wrote:
> > On Tue, Mar 17, 2015 at 09:18:07AM +0100, Peter Zijlstra wrote:
> > > On Mon, Mar 16, 2015 at 11:37:45AM -0700, Paul E. McKenney wrote:
> > > > From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> > > >
> > > > RCU ignores offlined CPUs, so they cannot safely run RCU read-side code.
> > > > (They -can- use SRCU, but not RCU.) This means that any use of RCU
> > > > during or after the call to arch_cpu_idle_dead() is unsafe. Unfortunately,
> > > > commit 2ed53c0d6cc99 added a complete() call, which will contain RCU
> > > > read-side critical sections if there is a task waiting to be awakened.
> > >
> > > Got a little more detail there?
> >
> > Quite possibly. But exactly what sort of detail are you looking for?
>
> What exact RCU usage you ran into that was problematic. It seems to
> imply that calling complete() -- from a dead cpu -- which ends up in
> try_to_wake_up() was the problem?
Yep, that was the one. At that point, the CPU can disappear without
any chance to tell RCU anything, so RCU has to have started ignoring
it beforehand. This bug has existed for a long time, masked by RCU's
waiting a jiffy before ignoring already-offline CPUs. Which would be a
problem if the CPU took longer than one jiffy to get from stop_machine()
to arch_cpu_idle_dead(). Which could actually happen, especially
in a guest OS.
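To make the timing argument concrete, here is a toy user-space model of it, not kernel code. Everything in it (struct toy_cpu, toy_rcu_watching(), TOY_GRACE_JIFFIES) is a hypothetical name made up for illustration, and it assumes the grace window is exactly one jiffy measured from the point where the CPU is marked offline:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the race described above; not kernel code. */
struct toy_cpu {
	bool online;
	unsigned long offline_jiffy;	/* jiffy at which the CPU was marked offline */
};

/* Assumption: RCU keeps honoring an offline CPU for one more jiffy. */
#define TOY_GRACE_JIFFIES 1UL

/*
 * Returns true if this toy RCU still pays attention to the CPU at time
 * "now", so that a read-side critical section entered then is protected.
 */
static bool toy_rcu_watching(const struct toy_cpu *cpu, unsigned long now)
{
	if (cpu->online)
		return true;
	assert(now >= cpu->offline_jiffy);
	return now - cpu->offline_jiffy <= TOY_GRACE_JIFFIES;
}
```

In this model, a CPU marked offline at jiffy 100 that reaches its complete() call at jiffy 101 is still being watched, but one that is delayed until jiffy 105 (say, by a preempted vCPU in a guest) enters its read-side critical section with RCU no longer paying attention.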
In addition, any tracing or printk()s on that code path (for example,
via lockdep) can also result in RCU read-side critical sections from an
offline CPU that RCU is ignoring.
So you would like me to pull this info into the commit log? Easy to
do if so.
Or am I missing your point?
Thanx, Paul