Re: cpu stopper threads and load balancing leads to deadlock
From: Paul E. McKenney
Date: Thu May 03 2018 - 12:11:21 EST
On Thu, May 03, 2018 at 04:44:50PM +0200, Peter Zijlstra wrote:
> On Thu, May 03, 2018 at 04:16:55PM +0200, Mike Galbraith wrote:
> > On Thu, 2018-05-03 at 15:56 +0200, Peter Zijlstra wrote:
> > > On Thu, May 03, 2018 at 03:32:39PM +0200, Mike Galbraith wrote:
> > >
> > > > Dang. With $subject fix applied as well..
> > >
> > > That's a NO then... :-(
> >
> > Could say who cares about oddball offline wakeup stat. <cringe>
>
> Yeah, nobody.. but I don't want to have to change the wakeup code to
> deal with this if at all possible. That'd just add conditions that are
> 'always' false, except in this exceedingly rare circumstance.
>
> So ideally we manage to tell RCU that it needs to pay attention while
> we're doing this here thing, which is what I thought RCU_NONIDLE() was
> about.

One straightforward approach would be to provide an arch-specific
Kconfig option that tells notify_cpu_starting() not to bother invoking
rcu_cpu_starting(). Then x86 selects this Kconfig option and invokes
rcu_cpu_starting() itself early enough to avoid splats.

See the (untested, probably does not even build) patch below.

I have no idea where to insert either the "select" or the call to
rcu_cpu_starting(), so I left those out. I know that putting the
call too early will cause trouble, but I have no idea what constitutes
"too early". :-/

							Thanx, Paul

------------------------------------------------------------------------
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 0db8938fbb23..58f7ea1de247 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -948,7 +948,8 @@ void notify_cpu_starting(unsigned int cpu)
 	enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);
 	int ret;
 
-	rcu_cpu_starting(cpu);	/* Enables RCU usage on this CPU. */
+	if (!IS_ENABLED(CONFIG_RCU_CPU_ONLINE_EARLY))
+		rcu_cpu_starting(cpu);	/* Enables RCU usage on this CPU. */
 	while (st->state < target) {
 		st->state++;
 		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index 9210379c0353..a874c0d74797 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -238,4 +238,7 @@ config RCU_NOCB_CPU
 	  Say Y here if you want to help to debug reduced OS jitter.
 	  Say N here if you are unsure.
 
+config RCU_CPU_ONLINE_EARLY
+	bool
+
 endmenu # "RCU Subsystem"
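
For anyone trying to picture the control flow the patch is aiming for, here is a rough user-space model, not kernel code. Everything in it is hypothetical: RCU_CPU_ONLINE_EARLY stands in for CONFIG_RCU_CPU_ONLINE_EARLY, the *_model() functions stand in for the kernel functions of similar name, and arch_cpu_bringup_model() is an invented placeholder for whatever arch code would end up making the early call. It also assumes that the splat comes from RCU being used on the incoming CPU before notify_cpu_starting() runs, and it says nothing about where the real call belongs.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only.  1 means "arch code marks the CPU online itself". */
#ifndef RCU_CPU_ONLINE_EARLY
#define RCU_CPU_ONLINE_EARLY 1
#endif

#define NR_CPUS_MODEL 4

static bool rcu_online[NR_CPUS_MODEL];		/* has RCU been told about this CPU? */
static int rcu_starting_calls[NR_CPUS_MODEL];	/* how many times it was told */

/* Stand-in for rcu_cpu_starting(): record that RCU now watches this CPU. */
static void rcu_cpu_starting_model(unsigned int cpu)
{
	assert(cpu < NR_CPUS_MODEL);
	rcu_online[cpu] = true;
	rcu_starting_calls[cpu]++;
}

/*
 * Stand-in for RCU use early in bring-up (assumed here to be something
 * like a wakeup).  Returns true if RCU is already watching the CPU,
 * false where the real kernel would presumably splat.
 */
static bool early_rcu_use_model(unsigned int cpu)
{
	return rcu_online[cpu];
}

/* Stand-in for notify_cpu_starting() with the patch applied. */
static void notify_cpu_starting_model(unsigned int cpu)
{
	if (!RCU_CPU_ONLINE_EARLY)
		rcu_cpu_starting_model(cpu);
	/* ... the hotplug state callbacks would run here ... */
}

/* Hypothetical arch bring-up path; returns whether the early use was safe. */
static bool arch_cpu_bringup_model(unsigned int cpu)
{
	bool safe;

	if (RCU_CPU_ONLINE_EARLY)
		rcu_cpu_starting_model(cpu);	/* arch makes the call itself */
	safe = early_rcu_use_model(cpu);	/* runs before the notifier */
	notify_cpu_starting_model(cpu);
	return safe;
}
```

With the macro left at 1, arch_cpu_bringup_model() reports the early use as safe and RCU is told about the CPU exactly once. Building with -DRCU_CPU_ONLINE_EARLY=0 models the unpatched ordering, where the early use happens before RCU knows about the CPU.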