Re: [PATCH tip/core/rcu 06/10] trace: Eliminate cond_resched_rcu_qs() in favor of cond_resched()
From: Paul E. McKenney
Date: Thu Mar 01 2018 - 15:48:56 EST
On Thu, Mar 01, 2018 at 12:04:04AM -0500, Steven Rostedt wrote:
> On Wed, 28 Feb 2018 17:21:44 -0800
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > > Perhaps, still think this is a special case. That said, perhaps
> > > cond_resched isn't done in critical locations as it's a place that is
> > > explicitly stating that it's OK to schedule.
> >
> > Building on your second sentence, when you are running a non-production
> > stress test, adding an extra function call and conditional branch to
> > cond_resched() should not be a problem.
> >
> > So how about the (still untested) patch below?
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit e9a6ea9fc2542459f9a63cf2b3a0264d09fbc266
> > Author: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > Date: Sun Feb 25 10:40:44 2018 -0800
> >
> > EXP sched: Make non-production PREEMPT cond_resched() help Tasks RCU
> >
> > In CONFIG_PREEMPT=y kernels, cond_resched() is a complete no-op, and
> > thus cannot help advance Tasks-RCU grace periods. However, such grace
> > periods are only an issue in non-production benchmarking runs of the
> > Linux kernel. This commit therefore makes cond_resched() invoke
> > rcu_note_voluntary_context_switch() for kernels implementing Tasks RCU
> > even in CONFIG_PREEMPT=y kernels.
> >
> > Reported-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index b161ef8a902e..970dadefb86f 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1589,6 +1589,12 @@ static inline int test_tsk_need_resched(struct task_struct *tsk)
> > */
> > #ifndef CONFIG_PREEMPT
> > extern int _cond_resched(void);
> > +#elif defined(CONFIG_TRACEPOINT_BENCHMARK)
> > +static inline int _cond_resched(void)
> > +{
> > + rcu_note_voluntary_context_switch(current);
>
> The thing I hate about this is that it is invasive to code outside of
> the tracepoint benchmark. Why do the rcu_note_voluntary_context_switch
> here and not in the tracepoint code? Seems odd to have it called
> everywhere in the kernel when it is only needed by the benchmark
> tracepoint code.

Understood, and I am not completely devoid of sympathy for that view.
My problem with adding rcu_note_voluntary_context_switch() to the
tracepoint code is that it is a pretty deep detail of RCU.

Hmmm... I wasn't happy with your original use of cond_resched_rcu_qs()
because it is now a rather strange thing. However, this discussion has
helped me to understand that its real distinction over cond_resched()
as things stand now is that it provides a quiescent state for Tasks RCU.

So how about I rename cond_resched_rcu_qs() to cond_resched_tasks_rcu_qs(),
which at least gives a hint as to where it needs to be used?
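
For concreteness, here is a rough userspace sketch of what the renamed
macro would amount to, assuming its body stays the same as that of the
existing cond_resched_rcu_qs(). Everything around the macro (struct
task_struct, current, cond_resched(), rcu_note_voluntary_context_switch(),
and the run_benchmark_loop() helper) is a hypothetical stand-in so that
this compiles outside the kernel. None of these are the real definitions.

```c
/* Userspace model only: the types and helpers below are stand-ins
 * (assumptions), not the kernel's actual definitions. */

struct task_struct {
	int tasks_rcu_qs_seen;	/* count of Tasks RCU quiescent states noted */
};

static struct task_struct mock_task;
#define current (&mock_task)

/* Models cond_resched() in a CONFIG_PREEMPT=y kernel: it never
 * reschedules, so it always returns 0. */
static inline int cond_resched(void)
{
	return 0;
}

/* Models reporting a voluntary context switch to Tasks RCU. */
static inline void rcu_note_voluntary_context_switch(struct task_struct *t)
{
	t->tasks_rcu_qs_seen++;
}

/* Proposed name. The body is assumed to match cond_resched_rcu_qs():
 * if cond_resched() did not reschedule, tell Tasks RCU about the
 * quiescent state explicitly. */
#define cond_resched_tasks_rcu_qs() \
do { \
	if (!cond_resched()) \
		rcu_note_voluntary_context_switch(current); \
} while (0)

/* Hypothetical caller, e.g. a benchmark loop that must not stall
 * Tasks RCU grace periods. Returns the quiescent states noted. */
int run_benchmark_loop(int iterations)
{
	current->tasks_rcu_qs_seen = 0;
	for (int i = 0; i < iterations; i++)
		cond_resched_tasks_rcu_qs();
	return current->tasks_rcu_qs_seen;
}
```

In this model cond_resched() always returns 0, so every pass through the
loop notes one quiescent state. The call site names Tasks RCU explicitly,
and the rcu_note_voluntary_context_switch() detail stays inside the macro.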

Would that work for you?

Thanx, Paul

> -- Steve
>
>
>
> > + return 0;
> > +}
> > #else
> > static inline int _cond_resched(void) { return 0; }
> > #endif
>