Re: [PATCH 05/32] nohz: Move rcu dynticks idle mode handling to idle enter/exit APIs

From: Frederic Weisbecker
Date: Tue Aug 30 2011 - 11:33:53 EST


On Tue, Aug 30, 2011 at 05:26:33PM +0200, Peter Zijlstra wrote:
> On Tue, 2011-08-30 at 16:32 +0200, Frederic Weisbecker wrote:
> > On Tue, Aug 30, 2011 at 01:21:55PM +0200, Peter Zijlstra wrote:
> > > On Tue, 2011-08-30 at 01:35 +0200, Frederic Weisbecker wrote:
> > > > > That means it has to be in an extended grace period when we stop the
> > > > > tick.
> > > >
> > > > You mean extended quiescent state?
> > >
> > > Yeah that :-)
> > >
> > > > As a summary here is what we do:
> > > >
> > > > - if we are in the kernel, we can't enter an extended quiescent state because
> > > > we may use RCU at any time there. But if we run nohz, we don't have the tick
> > > > to report quiescent states to the RCU machinery and help complete grace periods,
> > > > so as soon as we receive an RCU IPI from another CPU (due to the grace period
> > > > being extended because our nohz CPU doesn't report quiescent states), we restart
> > > > the tick. We are optimistic enough to expect that we still avoid a lot of ticks,
> > > > even at the risk of being disturbed at unpredictable rates.
> > > > So even with the IPI we consider it an upside.
> > > >
> > > > - if we are in userspace we can run in extended quiescent state.
> > >
> > > But you can only disable the tick/enter extended quiescent state while
> > > in kernel-space. Thus the second clause is precluded from ever being
> > > true.
> >
> > No, we have a specific stacking in the irq:
> >
> > rcu_irq_enter()
> >
> > disable tick...
> > if (user)
> > rcu_enter_nohz();
> >
> > rcu_irq_exit() <-- extended quiescent state entry effective only there
> >
> > And by the time we call rcu_irq_exit() and resume to userspace, we are
> > not supposed to be in any RCU read-side critical section (minus the case of
> > a signal with do_notify_resume(), which I have yet to handle).
>
> See all that is still kernelspace ;-) I think I know what you mean to
> say though, but seeing as you note there is even now a known shortcoming
> I'm not very confident it's a solid construction. What will help us find
> such holes?

This: https://lkml.org/lkml/2011/6/23/744

It's in one of Paul's branches and should make it for the next merge window.
This should detect any such holes. I wrote it on purpose for the nohz cpusets
when I saw how error-prone that can be with RCU :)

> I would much rather we not rely on such fragile things too much.. this
> RCU stuff wants way more thought, as it stands your patch-set doesn't do
> anything useful IMO.

Not sure what you mean. Well, that RCU thing is fragile for sure, but we have
the tools ready to find the problems.
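
To make the irq stacking quoted above concrete, here is a toy user-space
model of it. It assumes the dynticks state boils down to a single nesting
counter where 0 means "extended quiescent state"; that is a simplification
for illustration, not the kernel implementation. All toy_* names are
hypothetical stand-ins for rcu_irq_enter(), rcu_enter_nohz() and
rcu_irq_exit().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT kernel code: per-CPU nesting count (assumption). */
struct toy_cpu {
	int nesting;	/* 0 means extended quiescent state (EQS) */
};

static bool toy_in_eqs(const struct toy_cpu *cpu)
{
	return cpu->nesting == 0;
}

/* Stand-in for rcu_irq_enter(): one more level that may use RCU. */
static void toy_rcu_irq_enter(struct toy_cpu *cpu)
{
	cpu->nesting++;
}

/* Stand-in for rcu_irq_exit(): drop the irq level. */
static void toy_rcu_irq_exit(struct toy_cpu *cpu)
{
	assert(cpu->nesting > 0);
	cpu->nesting--;
}

/* Stand-in for rcu_enter_nohz(): drop the task-level reference. */
static void toy_rcu_enter_nohz(struct toy_cpu *cpu)
{
	assert(cpu->nesting > 0);
	cpu->nesting--;
}

/*
 * The stacking from the thread. Returns whether the CPU was already
 * in EQS while still inside the irq (it should not be).
 */
static bool toy_tick_stop_irq(struct toy_cpu *cpu, bool user)
{
	bool eqs_inside_irq;

	toy_rcu_irq_enter(cpu);
	/* disable tick... */
	if (user)
		toy_rcu_enter_nohz(cpu);
	eqs_inside_irq = toy_in_eqs(cpu);

	toy_rcu_irq_exit(cpu);	/* EQS entry effective only here */
	return eqs_inside_irq;
}
```

Starting from nesting == 1 (a running, non-idle CPU), the user case only
reaches 0 at toy_rcu_irq_exit(), while the kernel case goes back to 1 and
never enters the extended quiescent state.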