Re: [RFC] rcu: Warn that rcu ktheads cannot be spawned

From: Paul E. McKenney
Date: Wed Jun 26 2019 - 12:30:55 EST


On Wed, Jun 26, 2019 at 11:51:20AM +0900, Byungchul Park wrote:
> On Tue, Jun 25, 2019 at 06:31:15AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 25, 2019 at 11:41:00AM +0900, Byungchul Park wrote:
> > > On Mon, Jun 24, 2019 at 10:25:51AM -0700, Paul E. McKenney wrote:
> > > > On Mon, Jun 24, 2019 at 12:46:24PM -0400, Joel Fernandes wrote:
> > > > > On Mon, Jun 24, 2019 at 05:27:32PM +0900, Byungchul Park wrote:
> > > > > > Hello rcu folks,
> > > > > >
> > > > > > I thought it'd better to announce it if those spawnings fail because of
> > > > > > !rcu_scheduler_fully_active.
> > > > > >
> > > > > > Of course, with the current code, it never happens though.
> > > > > >
> > > > > > Thoughts?
> > > > >
> > > > > It seems in the right spirit, but with your patch a warning always fires.
> > > > > rcu_prepare_cpu() is called multiple times, once from rcu_init() and then
> > > > > from hotplug paths.
> > > > >
> > > > > Warning splat stack looks like:
> > > > >
> > > > > [ 0.398767] Call Trace:
> > > > > [ 0.398775] rcu_init+0x6aa/0x724
> > > > > [ 0.398779] start_kernel+0x220/0x4a2
> > > > > [ 0.398780] ? copy_bootdata+0x12/0xac
> > > > > [ 0.398782] secondary_startup_64+0xa4/0xb0
> > > >
> > > > Thank you both, and I will remove this from my testing queue.
> > > >
> > > > As Joel says, this is called at various points in the boot sequence, not
> > > > all of which are far enough along to support spawning kthreads.
> > > >
> > > > The real question here is "What types of bugs are we trying to defend
> > > > against?" But keeping in mind existing diagnostics. For example, are
> > > > there any kthreads for which a persistent failure to spawn would not
> > > > emit any error message. My belief is that any such persistent failure
> > > > would result in either an in-kernel diagnostic or an rcutorture failure,
> > > > but I might well be missing something.
> > > >
> > > > Thoughts? Or, more to the point, tests demonstrating silence in face
> > > > of such a persistent failure?
> > >
> > > You are right. There wouldn't be a persistent failure because the path
> > > turning cpus on always tries to spawn them, *even* in case that the
> > > booting sequence is wrong. The current code anyway goes right though.
> > >
> > > I thought a hole can be there if the code changes so that those kthreads
> > > cannot be spawned until the cpu being up, which is the case I was
> > > interested in. Again, it's gonna never happen with the current code
> > > because it spawns them after setting rcu_scheduler_fully_active to 1 in
> > > rcu_spawn_gp_kthead().
> > >
> > > And I wrongly thought you placed the rcu_scheduler_fully_active check on
> > > spawning just in case. But it seems to be not the case.
> > >
> > > So I'd better stop working on the warning patch. :) Instead, please
> > > check the following trivial fix.
> > >
> > > Thanks,
> > > Byungchul
> > >
> > > ---8<---
> > > >From 1293d19bb7abf7553d656c81182118eff54e7dc9 Mon Sep 17 00:00:00 2001
> > > From: Byungchul Park <byungchul.park@xxxxxxx>
> > > Date: Mon, 24 Jun 2019 16:22:11 +0900
> > > Subject: [PATCH] rcu: Make rcu_spawn_one_boost_kthread() return void
> > >
> > > The return value of rcu_spawn_one_boost_kthread() is not used any
> > > longer. Change the return type from int to void.
> > >
> > > Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx>
> >
> > Looks pretty good, just one comment below. Plus could you please tell
> > me what you are developing this against?
>
> Only for removing out-dated code. Not for funcational change.

I was actually expecting an answer of the form "current mainline",
"v5.2-rc3", "-rcu branch dev", or some such. If need be, I will ask
again for your next version. ;-)

> And simple answer below...
>
> >
> > Thanx, Paul
> >
> > > ---
> > > kernel/rcu/tree_plugin.h | 15 +++++++--------
> > > 1 file changed, 7 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > index 1102765..4e11aa4 100644
> > > --- a/kernel/rcu/tree_plugin.h
> > > +++ b/kernel/rcu/tree_plugin.h
> > > @@ -1131,7 +1131,7 @@ static void rcu_preempt_boost_start_gp(struct rcu_node *rnp)
> > > * already exist. We only create this kthread for preemptible RCU.
> > > * Returns zero if all is well, a negated errno otherwise.
> > > */
> > > -static int rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
> > > +static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
> > > {
> > > int rnp_index = rnp - rcu_get_root();
> > > unsigned long flags;
> > > @@ -1139,25 +1139,24 @@ static int rcu_spawn_one_boost_kthread(struct rcu_node *rnp)
> > > struct task_struct *t;
> > >
> > > if (!IS_ENABLED(CONFIG_PREEMPT_RCU))
> > > - return 0;
> > > + return;
> > >
> > > if (!rcu_scheduler_fully_active || rcu_rnp_online_cpus(rnp) == 0)
> > > - return 0;
> > > + return;
> > >
> > > rcu_state.boost = 1;
> > > if (rnp->boost_kthread_task != NULL)
> > > - return 0;
> > > + return;
> > > t = kthread_create(rcu_boost_kthread, (void *)rnp,
> > > "rcub/%d", rnp_index);
> > > if (IS_ERR(t))
> >
> > This would be a change in behavior, but it might be good to have a
> > WARN_ON_ONCE() above. Assuming that it doesn't splat on every boot. ;-)
>
> Yes. Normally it shouldn't. Right? I will resend this after testing with
> my machine. :)

My guess is that it will not splat, but if it does, it would not be the
first time that RCU has surprised me. ;-)

Thanx, Paul

> Thanks,
> Byungchul
>
> >
> > > - return PTR_ERR(t);
> > > + return;
> > > raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > rnp->boost_kthread_task = t;
> > > raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > > sp.sched_priority = kthread_prio;
> > > sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> > > wake_up_process(t); /* get to TASK_INTERRUPTIBLE quickly. */
> > > - return 0;
> > > }
> > >
> > > static void rcu_cpu_kthread_setup(unsigned int cpu)
> > > @@ -1265,7 +1264,7 @@ static void __init rcu_spawn_boost_kthreads(void)
> > > if (WARN_ONCE(smpboot_register_percpu_thread(&rcu_cpu_thread_spec), "%s: Could not start rcub kthread, OOM is now expected behavior\n", __func__))
> > > return;
> > > rcu_for_each_leaf_node(rnp)
> > > - (void)rcu_spawn_one_boost_kthread(rnp);
> > > + rcu_spawn_one_boost_kthread(rnp);
> > > }
> > >
> > > static void rcu_prepare_kthreads(int cpu)
> > > @@ -1275,7 +1274,7 @@ static void rcu_prepare_kthreads(int cpu)
> > >
> > > /* Fire up the incoming CPU's kthread and leaf rcu_node kthread. */
> > > if (rcu_scheduler_fully_active)
> > > - (void)rcu_spawn_one_boost_kthread(rnp);
> > > + rcu_spawn_one_boost_kthread(rnp);
> > > }
> > >
> > > #else /* #ifdef CONFIG_RCU_BOOST */
> > > --
> > > 1.9.1
> > >
>