Re: [PATCH 05/32] nohz: Move rcu dynticks idle mode handling to idle enter/exit APIs

From: Frederic Weisbecker
Date: Tue Aug 30 2011 - 10:26:29 EST


On Tue, Aug 30, 2011 at 01:19:18PM +0200, Peter Zijlstra wrote:
> On Tue, 2011-08-30 at 01:35 +0200, Frederic Weisbecker wrote:
> >
> > OTOH it is needed to find non-critical sections when asked to cooperate
> > in a grace period completion. But if no callbacks have been enqueued on
> > the whole system we are fine.
>
> It's that 'whole system' clause that I have a problem with. It would be
> perfectly fine to have a number of cpus very busy generating rcu
> callbacks, however this should not mean our adaptive nohz cpu should be
> bothered to complete grace periods.
>
> Requiring it to participate in the grace period state machine is a fail,
> plain and simple.

We need those nohz CPUs to participate because they may use read-side
critical sections themselves. So we need them to delay grace period
completion until the end of their running rcu read-side critical sections,
like any other CPU. Otherwise their supposed rcu read-side critical
sections wouldn't be effective.

Either that, or we only stop the tick when we are in userspace.
I'm not sure that would be a good idea.

We discussed this problem; I believe it mostly resides in rcu sched.
Finding quiescent states for rcu bh is easy, but rcu sched needs
the tick or context switches. (For rcu preempt I have no idea.)
So for now this is the sanest way we found among:

- Having explicit hooks in preempt_disable() and local_irq_restore()
to notice the end of an rcu sched critical section, so that we no longer
need the tick to find quiescent states. But that's going to be costly, and
we may miss some more implicitly non-preemptible code paths.

- Relying on context switches only. I believe in practice it should be
fine, but in theory this delays grace period completion for an unbounded
amount of time.
