Re: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17

From: Peter Zijlstra
Date: Sun Mar 27 2016 - 16:46:24 EST


On Sun, Mar 27, 2016 at 08:40:18AM -0700, Paul E. McKenney wrote:
> Oh, and the patch I am running with is below. I am running x86, and so
> some other architectures would of course need the corresponding patch
> on that architecture.

> -#define TIF_POLLING_NRFLAG 21 /* idle is polling for TIF_NEED_RESCHED */
> +/* #define TIF_POLLING_NRFLAG 21 idle is polling for TIF_NEED_RESCHED */

x86 is the only arch that really uses this heavily IIRC.

Most of the other archs need interrupts to wake up remote cores.

So what we try to do is avoid sending IPIs when the CPU is idle. For the
remote wakeup case we use set_nr_if_polling(), which sets
TIF_NEED_RESCHED if TIF_POLLING_NRFLAG was set. If it wasn't, we'll send
the IPI. Otherwise we rely on the idle loop to do sched_ttwu_pending()
when it breaks out of the loop due to TIF_NEED_RESCHED.
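
To make the decision concrete, here is a rough user-space model of that
path using C11 atomics. It is a sketch, not the kernel code: the
MODEL_* bit values and the model_* names are made up for illustration,
and a plain atomic_uint stands in for the thread_info flags word.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative bit values only; the real TIF_* numbers are per-arch. */
#define MODEL_NEED_RESCHED   (1u << 0)
#define MODEL_POLLING_NRFLAG (1u << 1)

/*
 * Set NEED_RESCHED only if the target is currently polling.
 * Returns true if the polling idle loop will notice the flag (no IPI
 * needed), false if the caller has to send the IPI itself.
 */
static bool model_set_nr_if_polling(atomic_uint *flags)
{
	unsigned int val = atomic_load(flags);

	for (;;) {
		if (!(val & MODEL_POLLING_NRFLAG))
			return false;	/* not polling: flags left untouched */
		if (val & MODEL_NEED_RESCHED)
			return true;	/* already set by someone else */
		/* On failure val is refreshed with the current flags. */
		if (atomic_compare_exchange_weak(flags, &val,
						 val | MODEL_NEED_RESCHED))
			return true;
	}
}

/* Remote wakeup decision: true means an IPI must be sent. */
static bool model_remote_wakeup_needs_ipi(atomic_uint *flags)
{
	return !model_set_nr_if_polling(flags);
}
```

Calling model_remote_wakeup_needs_ipi() on a flags word with the polling
bit set leaves both bits set and asks for no IPI; on a word with the
polling bit clear it changes nothing and asks for the IPI.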

But, you need hotplug for this to happen, right?

We should not be migrating towards, or waking on, CPUs no longer present
in cpu_active_map, and there is a rcu/sched_sync() after clearing that
bit. Furthermore, migration_call() does a sched_ttwu_pending() (waking
any remaining stragglers) before we migrate all runnable tasks off the
dying CPU.



The other interesting case would be resched_cpu(), which uses
set_nr_and_not_polling() to kick a remote CPU to call schedule(). It
atomically sets TIF_NEED_RESCHED and returns whether TIF_POLLING_NRFLAG
was clear. If it was clear, we send an IPI.
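
For comparison with the wakeup case, a rough user-space model of this
helper with C11 atomics might look as follows. Again this is only a
sketch: the MODEL_* bit values and the model_* name are invented, and an
atomic_uint stands in for the thread_info flags word. The point is that
NEED_RESCHED is set unconditionally here, and the previous value of the
polling bit alone decides whether an IPI is needed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative bit values only; the real TIF_* numbers are per-arch. */
#define MODEL_NEED_RESCHED   (1u << 0)
#define MODEL_POLLING_NRFLAG (1u << 1)

/*
 * Atomically set NEED_RESCHED and report whether the target was NOT
 * polling at that moment. A true return means the caller must send an
 * IPI; false means the polling idle loop will see the flag by itself.
 */
static bool model_set_nr_and_not_polling(atomic_uint *flags)
{
	unsigned int old = atomic_fetch_or(flags, MODEL_NEED_RESCHED);

	return !(old & MODEL_POLLING_NRFLAG);
}
```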

This assumes the idle 'exit' path will do the same as the IPI does; and
if you look at cpu_idle_loop() it does indeed do both
preempt_fold_need_resched() and sched_ttwu_pending().
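
The equivalence being relied on can be modelled roughly like this in
user space. Everything here is hypothetical scaffolding: struct
model_cpu, the wake_list counter and the model_* functions are invented
for the sketch, the MODEL_* bits are arbitrary, and the
preempt_fold_need_resched() half is not modelled at all. It only shows
that the idle exit path and the IPI handler end up draining the same
list of pending wakeups.

```c
#include <stdatomic.h>

/* Illustrative bit values only; the real TIF_* numbers are per-arch. */
#define MODEL_NEED_RESCHED   (1u << 0)
#define MODEL_POLLING_NRFLAG (1u << 1)

struct model_cpu {
	atomic_uint flags;	/* stands in for the thread_info flags */
	atomic_int  wake_list;	/* count of queued remote wakeups */
	int         woken;	/* wakeups completed on this CPU */
};

/* Stand-in for sched_ttwu_pending(): drain whatever was queued. */
static void model_ttwu_pending(struct model_cpu *cpu)
{
	cpu->woken += atomic_exchange(&cpu->wake_list, 0);
}

/* What the scheduler IPI does for a CPU that was not polling. */
static void model_scheduler_ipi(struct model_cpu *cpu)
{
	model_ttwu_pending(cpu);
}

/* One pass of a polling idle loop, ending in the idle 'exit' path. */
static void model_idle_once(struct model_cpu *cpu)
{
	atomic_fetch_or(&cpu->flags, MODEL_POLLING_NRFLAG);

	/* Poll until a remote CPU sets NEED_RESCHED. */
	while (!(atomic_load(&cpu->flags) & MODEL_NEED_RESCHED))
		;

	/* Exit path: stop advertising polling, then do the IPI's work. */
	atomic_fetch_and(&cpu->flags, ~MODEL_POLLING_NRFLAG);
	model_ttwu_pending(cpu);
}
```

Note that model_idle_once() spins until some other thread sets the
NEED_RESCHED bit, so in a single-threaded experiment the bit has to be
set before calling it.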

Note that one cannot rely on irq_enter()/irq_exit() being called for the
scheduler IPI.