Re: [patch 20/20] rcu: Make CPU_DYING_IDLE an explicit call

From: Paul E. McKenney
Date: Sat Feb 27 2016 - 11:33:19 EST


On Sat, Feb 27, 2016 at 12:30:33PM +0100, Thomas Gleixner wrote:
> On Sat, 27 Feb 2016, Paul E. McKenney wrote:
> > On Sat, Feb 27, 2016 at 08:47:41AM +0100, Thomas Gleixner wrote:
> > > On Fri, 26 Feb 2016, Paul E. McKenney wrote:
> > > > > > --- a/kernel/cpu.c
> > > > > > +++ b/kernel/cpu.c
> > > > > > @@ -762,6 +762,7 @@ void cpuhp_report_idle_dead(void)
> > > > > > BUG_ON(st->state != CPUHP_AP_OFFLINE);
> > > > > > st->state = CPUHP_AP_IDLE_DEAD;
> > > > > > complete(&st->done);
> > > > >
> > > > > What prevents the other CPU from killing this CPU at this point, so
> > > > > that this CPU does not tell RCU that it is dead?
> > > > >
> > > > > I agree that the odds should be low, but there are all manner of things
> > > > > that might delay a CPU for just a little bit too long...
> > > > >
> > > > > Or am I missing something subtle here?
> > >
> > > No. The reason I moved the rcu call past the complete() is that otherwise
> > > complete() complains about rcu being dead already. Hmm, but you are right. In
> > > theory the other side could allow physical removal before the outgoing CPU
> > > has actually told rcu that it's gone.
> >
> > There is one case where this is OK, and that is where the outgoing CPU
> > puts itself to sleep (or whatever) without help from the other CPU.
>
> That's the case. It's the last call before the outgoing CPU goes into
> arch_cpu_idle_dead(). There is no involvement of the controlling CPU at this
> point. It just wants to know that the outgoing one is finally dead.

Ah, so you have gotten rid of all the things like arm's and xtensa's
platform_cpu_kill(), where the surviving CPU does things like stopping
the outgoing CPU's clock? That would make things simpler!
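
To make the ordering concern concrete, here is a toy user-space model, not
kernel code. Everything in it is an assumption made for illustration: the
enum step type, the rcu_told_in_worst_case() helper, and the idea that the
surviving CPU halts the outgoing one at the earliest moment it legally
could, namely right after it observes the completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the steps the outgoing CPU runs on its way down. */
enum step { STEP_COMPLETE, STEP_RCU_REPORT };

/*
 * Walk the outgoing CPU's steps in order. If survivor_kills is true, the
 * surviving CPU is assumed to stop the outgoing CPU (for example by
 * stopping its clock) immediately after the completion is signalled, so
 * no later step executes. Returns true if RCU was told about the death
 * before the outgoing CPU stopped running.
 */
bool rcu_told_in_worst_case(const enum step *steps, int nsteps,
                            bool survivor_kills)
{
    bool rcu_told = false;

    assert(nsteps >= 0);
    for (int i = 0; i < nsteps; i++) {
        if (steps[i] == STEP_RCU_REPORT)
            rcu_told = true;
        else if (steps[i] == STEP_COMPLETE && survivor_kills)
            break;  /* outgoing CPU halted: remaining steps are lost */
    }
    return rcu_told;
}
```

In this model, the order { STEP_COMPLETE, STEP_RCU_REPORT } returns false
when survivor_kills is true, which is the window being asked about. The
same order returns true when survivor_kills is false, which corresponds to
the outgoing CPU putting itself to sleep with no help from the other CPU.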

Thanx, Paul