Re: [PATCH tip/core/rcu 06/10] trace: Eliminate cond_resched_rcu_qs() in favor of cond_resched()

From: Paul E. McKenney
Date: Sun Feb 25 2018 - 13:39:32 EST


On Sun, Feb 25, 2018 at 10:17:30AM -0800, Paul E. McKenney wrote:
> On Sun, Feb 25, 2018 at 09:49:27AM -0800, Paul E. McKenney wrote:
> > On Sat, Feb 24, 2018 at 03:12:40PM -0500, Steven Rostedt wrote:
> > > On Fri, 1 Dec 2017 11:21:40 -0800
> > > "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > Now that cond_resched() also provides RCU quiescent states when
> > > > needed, it can be used in place of cond_resched_rcu_qs(). This
> > > > commit therefore makes this change.
> > >
> > > Are you sure this is true?
> >
> > Up to a point. If a given CPU has been blocking an RCU grace period for
> > long enough, that CPU's rcu_dynticks.rcu_need_heavy_qs will be set, and
> > then the next cond_resched() will be treated as a cond_resched_rcu_qs().
> >
> > However, to your point, if there is no grace period in progress, if
> > the current grace period is not waiting on the CPU in question, or if
> > the grace-period kthread is starved of CPU, then cond_resched() has no
> > effect on RCU, unless of course it results in a context switch.
> >
> > > I just bisected a lock up on my machine down to this commit.
> > >
> > > With CONFIG_TRACEPOINT_BENCHMARK=y
> > >
> > > # cd linux.git/tools/testing/selftests/ftrace/
> > > # ./ftracetest test.d/ftrace/func_traceonoff_triggers.tc
> > >
> > > Locks up with a backtrace of:
> > >
> > > [ 614.186509] INFO: rcu_tasks detected stalls on tasks:
> >
> > Ah, but this is RCU-tasks, which never sets rcu_dynticks.rcu_need_heavy_qs
> > and thus needs a real context switch!
> >
> > Hey, when you said that synchronize_rcu_tasks() could take a very long
> > time, I took you at your word! ;-)
> >
> > Does the following (untested, probably does not even build) patch make
> > cond_resched() take a more peremptory approach to RCU-tasks?
>
> And probably not. You are probably running CONFIG_PREEMPT=y (otherwise
> RCU-tasks is trivial), so cond_resched() is a complete no-op:
>
> static inline int _cond_resched(void) { return 0; }
>
> I could make this call rcu_all_qs(), but I would not expect Peter Zijlstra
> to be at all happy with that sort of change.
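>
> For concreteness, that (purely hypothetical) change would look
> something like the following, at the cost of an added call on a
> rather hot fast path:
>
> static inline int _cond_resched(void)
> {
> 	rcu_all_qs(); /* Note quiescent states despite CONFIG_PREEMPT=y. */
> 	return 0;
> }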
>
> And the people who asked for the cond_resched() work probably aren't
> going to be happy with the resumed proliferation of cond_resched_rcu_qs().
>
> Hmmm... Grasping at straws... Could we make cond_resched() act
> something like a tracepoint, instrumenting it with the equivalent of
> cond_resched_rcu_qs() whenever the current RCU-tasks grace period has
> run for more than (say) a minute of its ten-minute stall-warning span?
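>
> As a purely hypothetical sketch, with rcu_tasks_gp_older_than() being
> a made-up helper rather than an existing API:
>
> static inline int _cond_resched(void)
> {
> 	/* Supply an RCU-tasks quiescent state, but only when the
> 	 * current grace period has already run for a long time. */
> 	if (rcu_tasks_gp_older_than(60 * HZ))
> 		rcu_note_voluntary_context_switch(current);
> 	return 0;
> }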

On the other hand, you noted in your other email that the tracepoint
benchmark should not be enabled on production systems. So how about
the following (again untested) patch? The "defined(CONFIG_TASKS_RCU)"
might need to change, especially if RCU-tasks ends up being used in
production kernels, but it is perhaps a starting point.

Thanx, Paul

------------------------------------------------------------------------

diff --git a/include/linux/sched.h b/include/linux/sched.h
index b161ef8a902e..316c29c5e506 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1589,6 +1589,12 @@ static inline int test_tsk_need_resched(struct task_struct *tsk)
  */
 #ifndef CONFIG_PREEMPT
 extern int _cond_resched(void);
+#elif defined(CONFIG_TASKS_RCU)
+static inline int _cond_resched(void)
+{
+	rcu_note_voluntary_context_switch(current);
+	return 0;
+}
 #else
 static inline int _cond_resched(void) { return 0; }
 #endif
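
For context, when CONFIG_TASKS_RCU=y, rcu_note_voluntary_context_switch()
expands to roughly the following (approximate, from memory):

	do {
		rcu_all_qs();
		if (READ_ONCE(current->rcu_tasks_holdout))
			WRITE_ONCE(current->rcu_tasks_holdout, false);
	} while (0)

Clearing ->rcu_tasks_holdout is what tells the RCU-tasks grace-period
kthread that this task has passed through a voluntary quiescent state,
which is exactly what the stalled synchronize_rcu_tasks() is waiting for.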