Re: [PATCH tip/core/rcu 11/20] sched,rcu: Make cond_resched() provide RCU quiescent state

From: Paul E. McKenney
Date: Tue Jan 17 2017 - 07:06:18 EST


On Tue, Jan 17, 2017 at 11:51:41AM +0100, Michal Hocko wrote:
> On Mon 16-01-17 16:54:03, Paul E. McKenney wrote:
> > On Mon, Jan 16, 2017 at 06:11:30PM +0100, Peter Zijlstra wrote:
> > > On Sat, Jan 14, 2017 at 01:13:12AM -0800, Paul E. McKenney wrote:
> > > > There is some confusion as to which of cond_resched() or
> > > > cond_resched_rcu_qs() should be added to long in-kernel loops.
> > > > This commit therefore eliminates the decision by adding RCU
> > > > quiescent states to cond_resched().
> > >
> > > Which would make: rcu_read_lock(); cond_resched(); rcu_read_unlock();
> > > invalid under preemptible RCU. Is it already?
> >
> > In theory, yes. In practice, I just tested it with preemption and
> > lockdep enabled, and it didn't complain. If further testing finds
> > complaints, we can either fix those uses (preferred) or revert
> > this patch.
> >
> > > > Warning: This is a prototype. For example, it does not correctly
> > > > handle Tasks RCU. Which is OK for the moment, given that no one
> > > > actually uses Tasks RCU yet.
> > >
> > > > --- a/kernel/sched/core.c
> > > > +++ b/kernel/sched/core.c
> > > > @@ -4907,6 +4907,7 @@ int __sched _cond_resched(void)
> > > >  		preempt_schedule_common();
> > > >  		return 1;
> > > >  	}
> > > > +	rcu_all_qs();
> > > >  	return 0;
> > > >  }
> > >
> > > Still not a real fan of this, it does make cond_resched() touch a bunch
> > > more cachelines, also, I suppose that if we're going to do this, we
> > > should make __cond_resched_lock() and __cond_resched_softirq() act
> > > similarly.
> >
> > Michal (now CCed) argues that having to distinguish between cond_resched()
> > and cond_resched_rcu_qs() is overly burdensome. Michal?
>
> Yes, it is really not clear which one is meant to be in which context. I
> really do not see which cond_resched should be turned into
> cond_resched_rcu_qs.
>
> > Any thoughts on how we might remove this burden without the additional
> > cache misses? I will take another look as well to see what could make
> > it lower cost. There are probably ways... Would it make sense to
> > have RCU maintain a need-rcu_all_qs() flag in the same cacheline as
> > the __preempt_count? Perhaps throttling the writes to this flag from
> > the RCU grace-period kthreads to once per 100 milliseconds or so?
>
> Can the stall detector simply request rescheduling when it gets
> dangerously close to the timeout?

It is quite possible that half of the stall timeout would be a better
choice than my 100 milliseconds, but either way, there would be need
for a flag or some such.
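
To make the flag idea concrete, here is a rough single-threaded user-space
sketch, not kernel code. Every name in it (struct percpu_hot, need_rcu_qs,
gp_request_qs(), sim_cond_resched(), QS_FLAG_PERIOD_MS) is made up for
illustration, and it ignores memory ordering and real per-CPU placement.
It assumes the flag sits next to the preempt count, that the grace-period
side writes it at most once per period, and that the cond_resched() side
reports a quiescent state only when the flag is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical throttle period; half the stall timeout might be better. */
#define QS_FLAG_PERIOD_MS 100

/*
 * Hypothetical per-CPU hot data: the flag lives beside the preempt
 * count so that checking it touches no additional cache line.
 */
struct percpu_hot {
	int preempt_count;	/* stand-in for __preempt_count */
	bool need_rcu_qs;	/* hypothetical need-rcu_all_qs() flag */
};

/* Hypothetical grace-period-side bookkeeping for the throttle. */
struct gp_state {
	bool armed;		/* has the flag ever been written? */
	uint64_t last_flag_ms;	/* time of the most recent flag write */
};

/* Counts simulated rcu_all_qs() invocations. */
static unsigned long qs_reported;

static void sim_rcu_all_qs(void)
{
	qs_reported++;
}

/*
 * Grace-period side: set the flag, but no more often than once per
 * QS_FLAG_PERIOD_MS.  Returns true if the flag was written.
 */
static bool gp_request_qs(struct gp_state *gp, struct percpu_hot *cpu,
			  uint64_t now_ms)
{
	if (gp->armed && now_ms - gp->last_flag_ms < QS_FLAG_PERIOD_MS)
		return false;
	gp->armed = true;
	gp->last_flag_ms = now_ms;
	cpu->need_rcu_qs = true;
	return true;
}

/*
 * cond_resched() side, no-reschedule path only: report a quiescent
 * state if one was requested, otherwise do nothing extra.
 */
static int sim_cond_resched(struct percpu_hot *cpu)
{
	if (cpu->need_rcu_qs) {
		cpu->need_rcu_qs = false;
		sim_rcu_all_qs();
	}
	return 0;
}
```

In this sketch the common case in sim_cond_resched() is a single test of a
field adjacent to the preempt count, and the expensive work happens at most
once per throttle period per CPU.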

Thanx, Paul