Re: [RFC v2 4/5] rcu: Use for_each_leaf_node_cpu() in force_qs_rnp()

From: Boqun Feng
Date: Tue Dec 20 2016 - 23:18:15 EST


On Tue, Dec 20, 2016 at 07:40:24PM -0800, Paul E. McKenney wrote:
[...]
> >
> > Agreed, my intent is to keep this overcare check for couples of releases
> > and if no one shoots his/her foot, we can remove it, if not, it
> > definitely means this part is subtle, and we need to pay more attention
> > to it, maybe write some regression tests for this particular problem to
> > help developers avoid it.
> >
> > This check is supposed to be removed, so I'm not stick to keeping it.
>
> I suggest keeping through validation. If it triggers during that time,
> consider keeping it longer. If it does not trigger, remove it before
> it goes upstream.
>

Good point ;-)

[...]
> > > >
> > > > But this brings a side question, is the callsite of rcu_cpu_starting()
> > > > is correct? Given rcu_cpu_starting() ignores the @cpu parameter and only
> > > > set _this_ cpu's bit in a leaf node?
> > >
> > > The calls from notify_cpu_starting() are called from the various
> > > start_kernel_secondary(), secondary_start_kernel(), and similarly
> > > named functions. These are called on the incoming CPU early in that
> > > CPU's execution. The call from rcu_init() is correct until such time
> > > as more than one CPU can be running at rcu_init() time. And that
> > > day might be coming, so please see the untested patch below.
> >
> > Looks better than mine ;-)
> >
> > But do we need to worry that we start rcu on each CPU twice, which may
> > slow down the boot?
>
> We only start a given CPU once. The boot CPU at rcu_init() time, and
> the rest at CPU-hotplug time. Unless of course a CPU is later taken

Confused... we call rcu_cpu_starting() in a for_each_online_cpu() loop
in rcu_init(), so we basically start all online CPUs there after
applying your patch. And all the rest CPUs will get themselves start
again at CPU-hotplug time, right?

Besides, without your patch, we started the boot CPU many times in the
for_each_online_cpu() loop.

Am I missing something subtle?

Regards,
Boqun

> offline, in which case we start it again when it comes back online.
>
> Thanx, Paul
>
> > Regards,
> > Boqun
> >
> > > Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > commit 1e84402587173d6d4da8645689f0e24c877b3269
> > > Author: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > > Date: Tue Dec 20 07:17:58 2016 -0800
> > >
> > > rcu: Make rcu_cpu_starting() use its "cpu" argument
> > >
> > > The rcu_cpu_starting() function uses this_cpu_ptr() to locate the
> > > incoming CPU's rcu_data structure. This works for the boot CPU and for
> > > all CPUs onlined after rcu_init() executes (during very early boot).
> > > Currently, this is the full set of CPUs, so all is well. But if
> > > anyone ever parallelizes boot before rcu_init() time, it will fail.
> > > This commit therefore substitutes the rcu_cpu_starting() function's
> > > this_cpu_pointer() for per_cpu_ptr(), future-proofing the code and
> > > (arguably) improving readability.
> > >
> > > Reported-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > >
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index b9d3c0e30935..083cb8a6299c 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -4017,7 +4017,7 @@ void rcu_cpu_starting(unsigned int cpu)
> > > struct rcu_state *rsp;
> > >
> > > for_each_rcu_flavor(rsp) {
> > > - rdp = this_cpu_ptr(rsp->rda);
> > > + rdp = per_cpu_ptr(rsp->rda, cpu);
> > > rnp = rdp->mynode;
> > > mask = rdp->grpmask;
> > > raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > >
>
>

Attachment: signature.asc
Description: PGP signature