Re: [PATCH 05/32] nohz: Move rcu dynticks idle mode handling to idle enter/exit APIs

From: Frederic Weisbecker
Date: Tue Aug 30 2011 - 18:24:47 EST


On Tue, Aug 30, 2011 at 10:58:38PM +0200, Peter Zijlstra wrote:
> On Tue, 2011-08-30 at 17:42 +0200, Peter Zijlstra wrote:
> > On Tue, 2011-08-30 at 17:33 +0200, Frederic Weisbecker wrote:
> > > > See all that is still kernelspace ;-) I think I know what you mean to
> > > > say though, but seeing as you note there is even now a known shortcoming
> > > > I'm not very confident its a solid construction. What will help us find
> > > > such holes?
> > >
> > > This: https://lkml.org/lkml/2011/6/23/744
> > >
> > > It's in one of Paul's branches and should make it for the next merge window.
> > > This should detect any of such holes. I made that on purpose for the nohz cpusets
> > > when I saw how much error prone that can be with rcu :)
> >
> > OK, good ;-)
> >
> > > > I would much rather we not rely on such fragile things too much.. this
> > > > RCU stuff wants way more thought, as it stands your patch-set doesn't do
> > > > anything useful IMO.
> > >
> > > Not sure what you mean. Well that Rcu thing for sure is fragile but we have
> > > the tools ready to find the problems.
> >
> > Right that thing you linked above does catch abuse, still your current
> > proposal means that due to RCU it will basically never disable the tick.
>
> So how about something like:
>
> Assuming we are in rcu_nohz state; on kernel enter we leave rcu_nohz but
> don't start the tick, instead we assign another cpu to run our state
> machine.

The nohz CPU still has to notice its own quiescent states. Asking another CPU
to handle all the rest once a quiescent state is found could be an
optimization. It doesn't solve our main problem though, which is to reliably
report quiescent states when asked to.

> On kernel exit we 'donate' all our rcu state to a willing victim (the
> same that earlier was kind enough to drive our state) and undo our
> entire GP accounting and re-enter rcu_nohz state.

That's already what rcu_enter_nohz() does.

> If between that time we did restart the tick, we take back our rcu state
> and skip the donate and rcu_nohz enter on kernel exit.

That's also what this patchset does. As soon as we re-enter the kernel, or
the tick has to be restarted before we re-enter the kernel, we call
rcu_exit_nohz(), which pulls the CPU back into the whole RCU machinery.
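
To make the pairing concrete, here is a minimal userspace sketch of the
rule described above. It is a toy model and not the patchset's code:
struct toy_cpu and every toy_* function are hypothetical names made up for
illustration, and the two toy_rcu_*_nohz() helpers only stand in for
rcu_enter_nohz() and rcu_exit_nohz() by flipping a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state; hypothetical, not a kernel data structure. */
struct toy_cpu {
	bool tick_running;	/* periodic tick is armed on this CPU */
	bool rcu_nohz;		/* RCU treats this CPU as quiescent */
};

/* Stand-in for rcu_enter_nohz(): RCU stops waiting on this CPU. */
void toy_rcu_enter_nohz(struct toy_cpu *cpu)
{
	assert(!cpu->rcu_nohz);	/* enter/exit calls must stay paired */
	cpu->rcu_nohz = true;
}

/* Stand-in for rcu_exit_nohz(): the CPU rejoins the RCU machinery. */
void toy_rcu_exit_nohz(struct toy_cpu *cpu)
{
	assert(cpu->rcu_nohz);
	cpu->rcu_nohz = false;
}

/* Stopping the tick alone does not change the RCU state. */
void toy_tick_stop(struct toy_cpu *cpu)
{
	cpu->tick_running = false;
}

/* Restarting the tick pulls the CPU back into RCU right away. */
void toy_tick_restart(struct toy_cpu *cpu)
{
	cpu->tick_running = true;
	if (cpu->rcu_nohz)
		toy_rcu_exit_nohz(cpu);
}

/* Kernel entry: leave the RCU nohz state if we were in it. */
void toy_kernel_enter(struct toy_cpu *cpu)
{
	if (cpu->rcu_nohz)
		toy_rcu_exit_nohz(cpu);
}

/* Kernel exit: enter the RCU nohz state only while the tick is stopped. */
void toy_kernel_exit(struct toy_cpu *cpu)
{
	if (!cpu->tick_running && !cpu->rcu_nohz)
		toy_rcu_enter_nohz(cpu);
}
```

The point of the model is only the ordering: whichever comes first, kernel
entry or a tick restart, performs the exit, and a later kernel exit skips
the enter as long as the tick is still running.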