Re: [PATCH RFC tip/core/rcu] Eliminate deadlock between CPU hotplug and expedited grace periods

From: Rafael J. Wysocki
Date: Thu Sep 18 2014 - 18:35:37 EST


On Thursday, September 18, 2014 05:38:45 AM Paul E. McKenney wrote:
> On Thu, Sep 18, 2014 at 03:15:36PM +0800, Lan Tianyu wrote:
> > On 2014-09-17 21:10, Paul E. McKenney wrote:
> > > On Wed, Sep 17, 2014 at 03:11:42PM +0800, Lan Tianyu wrote:
> > >> On 2014-08-29 03:47, Paul E. McKenney wrote:
> > >>> Currently, the expedited grace-period primitives do get_online_cpus().
> > >>> This greatly simplifies their implementation, but means that calls to
> > >>> them holding locks that are acquired by CPU-hotplug notifiers (to say
> > >>> nothing of calls to these primitives from CPU-hotplug notifiers) can
> > >>> deadlock. But this is starting to become inconvenient:
> > >>> https://lkml.org/lkml/2014/8/5/754
> > >>>
> > >>> This commit avoids the deadlock and retains the simplicity by creating
> > >>> a try_get_online_cpus(), which returns false if the get_online_cpus()
> > >>> reference count could not immediately be incremented. If a call to
> > >>> try_get_online_cpus() returns true, the expedited primitives operate
> > >>> as before. If a call returns false, the expedited primitives fall back
> > >>> to normal grace-period operations. This falling back of course results
> > >>> in increased grace-period latency, but only during times when CPU
> > >>> hotplug operations are actually in flight. The effect should therefore
> > >>> be negligible during normal operation.
> > >>>
> > >>> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > >>> Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> > >>> Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> > >>> Cc: Lan Tianyu <tianyu.lan@xxxxxxxxx>
> > >>
> > >> Hi Paul:
> > >> What's the status of the patch? Will you push it? Thanks.
> > >
> > > By default, it would go into 3.19. Do you need it earlier?
> >
> > IMO, this is a deadlock bug that is hard to reproduce, so shouldn't
> > the patch go into v3.17 and the stable tree?
>
> The problem with pushing for v3.17 is that I would have to rebase
> that commit to the bottom of my current stack and redo all my testing.
> If there were any problems, I could not only miss v3.17, but also miss
> the v3.18 merge window.
>
> So, given that the next merge window happens pretty soon, how about
> v3.18 and the stable tree?

That sounds good to me.

Rafael
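
For anyone reading this thread without the patch at hand, the fallback
described in the quoted changelog can be sketched roughly as below. This is
a userspace illustration only, not the kernel code: it assumes a pthread
mutex standing in for the hotplug lock, and every name ending in _sketch,
the hotplug_state struct, and the gp_path enum are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU-hotplug state (not the kernel struct). */
struct hotplug_state {
	pthread_mutex_t lock;	/* held while a hotplug operation is in flight */
	int refcount;		/* number of readers currently holding a reference */
};

static struct hotplug_state hp = { PTHREAD_MUTEX_INITIALIZER, 0 };

/* Take a reader reference only if that can be done without blocking. */
static bool try_get_online_cpus_sketch(void)
{
	if (pthread_mutex_trylock(&hp.lock) != 0)
		return false;	/* lock is busy: the caller must fall back */
	hp.refcount++;
	pthread_mutex_unlock(&hp.lock);
	return true;
}

/* Drop a reference taken by a successful try_get_online_cpus_sketch(). */
static void put_online_cpus_sketch(void)
{
	pthread_mutex_lock(&hp.lock);
	assert(hp.refcount > 0);
	hp.refcount--;
	pthread_mutex_unlock(&hp.lock);
}

/* Which path a grace-period request ended up taking. */
enum gp_path { GP_EXPEDITED, GP_NORMAL };

static enum gp_path synchronize_expedited_sketch(void)
{
	if (!try_get_online_cpus_sketch())
		return GP_NORMAL;	/* higher latency, but never waits on hotplug */

	/* ... the expedited grace-period work would run here ... */

	put_online_cpus_sketch();
	return GP_EXPEDITED;
}
```

The point of the sketch is that the expedited caller never blocks on the
hotplug lock, so holding a lock that a hotplug notifier also wants can no
longer deadlock. The cost is the slower path while hotplug is in flight.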
