Re: [RFC] Make need_resched() return true when rcu_urgent_qs requested

From: Paul E. McKenney
Date: Mon Jul 09 2018 - 12:32:24 EST


On Mon, Jul 09, 2018 at 05:26:32PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 09, 2018 at 07:29:32AM -0700, Paul E. McKenney wrote:
> > OK, so here are our options:
> >
> > 1. Add the RCU conditional to need_resched(), as David suggests.
> > Peter has concerns about overhead.
> >
> > 2. Create a new need_resched_rcu_qs() that is to be used when
> > deciding whether or not to do cond_resched(). This would
> > exact the overhead only where it is needed, but is one more
> > thing for people to get wrong.
>
> Also, with the crypto guys checking need_resched() in asm that won't
> really work either.

Fair point! Ease of use is a good thing, even within the Linux kernel.
Or maybe especially within the Linux kernel...

> > 3. Revert my changes to de-emphasize cond_resched_rcu_qs(),
> > and go back to sprinkling cond_resched_rcu_qs() throughout
> > the code. This also is one more thing for people to get wrong,
> > and might well eventually convert all cond_resched() calls to
> > cond_resched_rcu_qs(), which sure seems like a failure mode to me.
>
> 4a. use resched_cpu() more aggressively
> 4b. use the tick to set TIF_NEED_RESCHED when it finds rcu_urgent_qs
> (avoids the IPI at the 'cost' of a slight delay in processing)

4b sounds eminently reasonable to me! Something like the (untested,
probably doesn't even build) patch below?

David, any reason why this wouldn't work? Seems to me that this would
make need_resched() respond to RCU's need for quiescent states in a
timely manner without need_resched() having to become heavier weight,
but figured I should ask.
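
To make the intent of 4b concrete, here is a minimal userspace model of it,
not kernel code. Every name in it (struct cpu_model, model_need_resched(),
model_tick()) is hypothetical and exists only for illustration; the fields
merely stand in for rcu_urgent_qs, TIF_NEED_RESCHED, and
is_idle_task(current). The point is that the hot-path check stays a single
flag test while the once-per-tick slow path folds RCU's request into that flag.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct cpu_model {
	bool rcu_urgent_qs;	/* RCU urgently needs a quiescent state here */
	bool need_resched;	/* stands in for TIF_NEED_RESCHED */
	bool running_idle;	/* stands in for is_idle_task(current) */
};

/* Hot path: remains one flag test, with no RCU-specific check added. */
static bool model_need_resched(const struct cpu_model *cpu)
{
	return cpu->need_resched;
}

/* Slow path: runs once per scheduling-clock tick. */
static void model_tick(struct cpu_model *cpu)
{
	/* Fold RCU's request into the flag that callers already test. */
	if (cpu->rcu_urgent_qs && !cpu->running_idle)
		cpu->need_resched = true;

	/* The request has been handled (or is moot on an idle CPU). */
	cpu->rcu_urgent_qs = false;
}
```

In this model the cost of the approach is latency rather than overhead:
model_need_resched() can keep returning false for up to one tick period
after RCU sets rcu_urgent_qs.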

> 5. make guest mode a quiescent state (like supposedly already done
> for NOHZ_FULL) (but this would not help the crypto guys).
>
> 6. ....
>
> ok I ran out of ideas here I think.
>
>
> So for PREEMPT the tick can check preempt_count() == 0 and if so, know
> it _could_ have rescheduled and advance the qs, right? But since we
> don't have a preempt count for !PREEMPT_COUNT kernels this doesn't work.
>
> And thus we need to invoke actual scheduling events and then through the
> schedule() callback RCU knows things happened.
>
> 4b seems like something worth trying for !PREEMPT kernels I suppose

David is running a !PREEMPT kernel.

For PREEMPT kernels, the patch below results in a quiescent state for
the CPU, and the forced schedule queues the task. This queuing enables
later RCU priority boosting (if enabled) once all other CPUs sharing
the same leaf rcu_node structure have passed through quiescent states.

And yes, for PREEMPT kernels the scheduling-clock interrupt handler
already checks for a quiescent state using a combination of
preempt_count() (as you say, but ignoring the hardirq bits because
we are in an interrupt handler) and current->rcu_read_lock_nesting.
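
As a rough illustration of that check, here is a userspace sketch, again
not kernel code. The MODEL_* masks and model_tick_sees_qs() are hypothetical
names, and the bit layout (8 preempt-disable bits, then 8 softirq bits, then
the hardirq bits) is an assumption made for the example. The tick interrupt
itself accounts for the hardirq bits, so only the other two fields and the
RCU read-side nesting depth are consulted.

```c
#include <stdbool.h>

/* Hypothetical masks modelling the fields packed into preempt_count(). */
#define MODEL_PREEMPT_MASK	0x000000ffu	/* preempt_disable() nesting */
#define MODEL_SOFTIRQ_MASK	0x0000ff00u	/* softirq / bh-disable nesting */
#define MODEL_HARDIRQ_ONE	0x00010000u	/* one level of hardirq nesting */

/*
 * Return true if the tick handler may report a quiescent state for the
 * interrupted task. The hardirq bits are ignored on purpose: they are set
 * only because the tick handler is itself running in interrupt context.
 */
static bool model_tick_sees_qs(unsigned int preempt_count,
			       int rcu_read_lock_nesting)
{
	/* Still inside an RCU read-side critical section. */
	if (rcu_read_lock_nesting > 0)
		return false;

	/* Preemption or bottom halves were disabled when the tick hit. */
	if (preempt_count & (MODEL_PREEMPT_MASK | MODEL_SOFTIRQ_MASK))
		return false;

	return true;
}
```

For example, model_tick_sees_qs(MODEL_HARDIRQ_ONE, 0) is true, whereas
adding a single preempt_disable() level (MODEL_HARDIRQ_ONE | 1) or a
nonzero read-side nesting depth makes it false.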

So I believe that this will cover it.

Thoughts?

Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 51919985f6cf..33b0a1ec0536 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2496,6 +2496,10 @@ void rcu_check_callbacks(int user)
 {
 	trace_rcu_utilization(TPS("Start scheduler-tick"));
 	raw_cpu_inc(rcu_data.ticks_this_gp);
+	if (smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)) &&
+	    !is_idle_task(current))
+		set_tsk_need_resched(current);
+	__this_cpu_write(rcu_dynticks.rcu_urgent_qs, false);
 	rcu_flavor_check_callbacks(user);
 	if (rcu_pending())
 		invoke_rcu_core();