Re: [PATCH v2 rcu 04/18] rcu: Weaken ->dynticks accesses and updates

From: Paul E. McKenney
Date: Wed Jul 28 2021 - 15:45:08 EST


On Wed, Jul 28, 2021 at 11:58:54AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 28, 2021 at 02:23:05PM -0400, Mathieu Desnoyers wrote:
> > ----- On Jul 28, 2021, at 1:37 PM, paulmck paulmck@xxxxxxxxxx wrote:
> > [...]
> > >
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 42a0032dd99f7..c87b3a271d65b 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -251,6 +251,15 @@ void rcu_softirq_qs(void)
> > > rcu_tasks_qs(current, false);
> > > }
> > >
> > > +/*
> > > + * Increment the current CPU's rcu_data structure's ->dynticks field
> > > + * with ordering. Return the new value.
> > > + */
> > > +static noinstr unsigned long rcu_dynticks_inc(int incby)
> > > +{
> > > + return arch_atomic_add_return(incby, this_cpu_ptr(&rcu_data.dynticks));
> > > +}
> > > +
> >
> > [...]
> >
> > > @@ -308,7 +317,7 @@ static void rcu_dynticks_eqs_online(void)
> > >
> > > if (atomic_read(&rdp->dynticks) & 0x1)
> > > return;
> >
> > Can the thread be migrated at this point ? If yes, then
> > the check and the increment may happen on different cpu's rdps. Is
> > that OK ?
>
> Good point! Actually, it can be migrated, but it does not matter.
> In fact, it so completely fails to matter that it is totally useless. :-/
>
> The incoming CPU is still offline, so this is run from some other
> completely-online CPU. Because this CPU is executing in non-idle
> kernel context, that "if" condition must evaluate to true, so that the
> rcu_dynticks_inc() below is dead code.
>
> Maybe I should move the call to rcu_dynticks_eqs_online() to
> rcu_cpu_starting(), which is pinned to the incoming CPU. Yes, I
> could remove it completely, but then small changes in the offline
> process could cause great mischief.
>
> Good catch, thank you!

And how about like this?

Thanx, Paul

------------------------------------------------------------------------

commit cb8914dcc6443cca15ce48d937a93c0dfdb114d3
Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
Date: Wed Jul 28 12:38:42 2021 -0700

rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting()

The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
counter of an incoming CPU if required. It is currently invoked
from rcutree_prepare_cpu(), which runs before the incoming CPU is
running, and thus on some other CPU. This makes the per-CPU accesses in
rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
the running CPU cannot possibly be in dyntick-idle mode, which means
that rcu_dynticks_eqs_online() never has any effect. One could argue
that this means that rcu_dynticks_eqs_online() is unnecessary. However,
removing it makes the CPU-online process vulnerable to slight changes
in the CPU-offline process.

This commit therefore moves the call to rcu_dynticks_eqs_online() from
rcutree_prepare_cpu() to rcu_cpu_starting(), this latter being guaranteed
to be running on the incoming CPU. The call to this function must of
course be placed before rcu_cpu_starting() announces this CPU's
presence to RCU.

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0172a5fd6d8de..aa00babdaf544 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4129,7 +4129,6 @@ int rcutree_prepare_cpu(unsigned int cpu)
 	rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
 	rdp->blimit = blimit;
 	rdp->dynticks_nesting = 1;	/* CPU not up, no tearing. */
-	rcu_dynticks_eqs_online();
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */

 	/*
@@ -4249,6 +4248,7 @@ void rcu_cpu_starting(unsigned int cpu)
 	mask = rdp->grpmask;
 	WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
 	WARN_ON_ONCE(!(rnp->ofl_seq & 0x1));
+	rcu_dynticks_eqs_online();
 	smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
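
------------------------------------------------------------------------

For readers following the reasoning in the commit log, here is a rough
userspace sketch of why the old call site was a no-op. It is a toy model
only, not kernel code: the toy_* names, the two-element array standing in
for per-CPU rcu_data, and the explicit this_cpu argument standing in for
this_cpu_ptr() are all made up for illustration, and it assumes that an
offline CPU's counter was left even.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2	/* hypothetical CPU count for this model */

/* Stand-in for each CPU's ->dynticks: odd means RCU is watching that CPU. */
static atomic_long toy_dynticks[TOY_NR_CPUS];

/*
 * Toy analogue of rcu_dynticks_eqs_online().  "this_cpu" models the CPU
 * the caller happens to be running on.  Returns true if the counter had
 * to be adjusted, false if it was already odd.
 */
bool toy_dynticks_eqs_online(int this_cpu)
{
	if (atomic_load(&toy_dynticks[this_cpu]) & 0x1)
		return false;
	atomic_fetch_add(&toy_dynticks[this_cpu], 1);
	return true;
}

/* CPU 0 is online and non-idle, CPU 1 is the incoming CPU. */
int toy_demo(void)
{
	atomic_store(&toy_dynticks[0], 1);	/* odd: running kernel code */
	atomic_store(&toy_dynticks[1], 2);	/* even: assumed offline state */

	/* Old call site: runs on CPU 0, so it only ever looks at CPU 0. */
	assert(!toy_dynticks_eqs_online(0));
	assert(atomic_load(&toy_dynticks[1]) == 2);	/* CPU 1 untouched */

	/* New call site: runs on the incoming CPU itself. */
	assert(toy_dynticks_eqs_online(1));
	assert(atomic_load(&toy_dynticks[1]) == 3);	/* now odd */

	/* A second call finds the counter already odd and does nothing. */
	assert(!toy_dynticks_eqs_online(1));
	return 0;
}
```

Calling toy_demo() from a main() runs the scenario. The first call models
the rcutree_prepare_cpu() placement, where the check always sees the
running CPU's odd counter and returns early. The second models the
rcu_cpu_starting() placement, where the check sees the incoming CPU's
own counter.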