Re: [PATCH v2 rcu 04/18] rcu: Weaken ->dynticks accesses and updates

From: Paul E. McKenney
Date: Thu Jul 29 2021 - 11:58:25 EST


On Thu, Jul 29, 2021 at 10:41:18AM -0400, Mathieu Desnoyers wrote:
> ----- On Jul 28, 2021, at 4:28 PM, paulmck paulmck@xxxxxxxxxx wrote:
>
> > On Wed, Jul 28, 2021 at 04:03:02PM -0400, Mathieu Desnoyers wrote:
> >> ----- On Jul 28, 2021, at 3:45 PM, paulmck paulmck@xxxxxxxxxx wrote:
> >> [...]
> >> >
> >> > And how about like this?
> >> >
> >> > Thanx, Paul
> >> >
> >> > ------------------------------------------------------------------------
> >> >
> >> > commit cb8914dcc6443cca15ce48d937a93c0dfdb114d3
> >> > Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
> >> > Date: Wed Jul 28 12:38:42 2021 -0700
> >> >
> >> > rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting()
> >> >
> >> > The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
> >> > counter of an incoming CPU if required. It is currently is invoked
> >>
> >> "is currently is" -> "is currently"
> >
> > Good catch, fixed!
> >
> >> > from rcutree_prepare_cpu(), which runs before the incoming CPU is
> >> > running, and thus on some other CPU. This makes the per-CPU accesses in
> >> > rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
> >> > the running CPU cannot possibly be in dyntick-idle mode, which means
> >> > that rcu_dynticks_eqs_online() never has any effect. One could argue
> >> > that this means that rcu_dynticks_eqs_online() is unnecessary, however,
> >> > removing it makes the CPU-online process vulnerable to slight changes
> >> > in the CPU-offline process.
> >>
> >> Why favor moving this from the prepare_cpu to the cpu_starting hotplug step,
> >> rather than using the target cpu's rdp from rcutree_prepare_cpu ? Maybe there
> >> was a good reason for having this very early in the prepare_cpu step ?
> >
> > Some years back, there was a good reason. This reason was that
> > rcutree_prepare_cpu() marked the CPU as being online from an RCU
> > viewpoint. But now rcu_cpu_starting() is the one that marks the CPU as
> > being online, so the ->dynticks check can be deferred to this function.
> >
> >> Also, the commit message refers to this bug as having no effect because the
> >> running CPU cannot possibly be in dyntick-idle mode. I understand that calling
> >> this function was indeed effect-less, but then why is it OK for the CPU coming
> >> online to skip this call in the first place ? This commit message hints at
> >> "slight changes in the CPU-offline process" which could break it, but therer is
> >> no explanation of what makes this not an actual bug fix.
> >
> > Because rcutorture would not have suffered in silence had this
> > situation ever arisen.
>
> Testing can usually prove the presence of a bug, but it's rather tricky to prove
> the absence of bug.

In general, true enough.

But in this particular case, a WARN would have deterministically triggered
the very next time that this CPU found its way either to the idle loop
or to nohz_full usermode execution.

> > I have updated the commit log to answer these questions as shown
> > below. Thoughts?
>
> I'm still concerned about one scenario wrt moving rcu_dynticks_eqs_online()
> from rcutree_prepare_cpu to rcu_cpu_starting. What happens if an interrupt
> handler, or a NMI handler, nests early over the CPU-online startup code ?
> AFAIU, this interrupt handler could contain RCU read-side critical sections,
> but if the eqs state does not show the CPU as "online", I wonder whether it
> will work as expected.

Interrupts are still disabled at this point in the onlining process,
my _irqsave() locks notwithstanding.

You are right about NMI handlers, but there would be much more damage
from an early NMI handler than mere RCU issues. And this can be handled
as described in the next paragraph.

Now, there are architectures (including x86) that need RCU readers
before notify_cpu_starting() time (which is where rcu_cpu_starting()
is invoked by default, and those architectures can (and do) simply
place a call to rcu_cpu_starting() at an earlier appropriate point in
the architecture-specific CPU-bringup code. And this is in fact the
reason for the ->cpu_started check at the beginning of rcu_cpu_starting().
So an architecture using NMIs early in the CPU-bringup code should
invoke rcu_cpu_starting() before enabling NMIs.

So why not just be safe and mark the CPU online early in the process?

Because that could result in unbounded grace periods and strange
deadlocks. These deadlocks were broken earlier by code that assumed that
a CPU could not possibly take more than one jiffy to come online, but that
assumption is clearly broken on todays systems, even the bare-metal ones.

In theory, I could change the raw_spin_lock_irqsave_rcu_node() to
raw_spin_lock_rcu_node(), rely on the lockdep_assert_irqs_disabled()
in the matching raw_spin_unlock_rcu_node(), and ditch the "flags"
local variable, but rcu_report_qs_rnp() needs that "flags" argument.

OK, I guess one approach is to get the "flags" value from local_save_flags()
and get rid of the _irqsave and _irq restore. Assuming lockdep is
functional that early in CPU bringup.

But would that really be better than status quo?

Thanx, Paul

> Thanks,
>
> Mathieu
>
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit 516c8c4cc6fce62539f7e0182739812db4591c1d
> > Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > Date: Wed Jul 28 12:38:42 2021 -0700
> >
> > rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting()
> >
> > The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
> > counter of an incoming CPU when required. It is currently invoked
> > from rcutree_prepare_cpu(), which runs before the incoming CPU is
> > running, and thus on some other CPU. This makes the per-CPU accesses in
> > rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
> > the running CPU cannot possibly be in dyntick-idle mode, which means
> > that rcu_dynticks_eqs_online() never has any effect.
> >
> > It is currently OK for rcu_dynticks_eqs_online() to have no effect, but
> > only because the CPU-offline process just happens to leave ->dynticks in
> > the correct state. After all, if ->dynticks were in the wrong state on a
> > just-onlined CPU, rcutorture would complain bitterly the next time that
> > CPU went idle, at least in kernels built with CONFIG_RCU_EQS_DEBUG=y,
> > for example, those built by rcutorture scenario TREE04. One could
> > argue that this means that rcu_dynticks_eqs_online() is unnecessary,
> > however, removing it would make the CPU-online process vulnerable to
> > slight changes in the CPU-offline process.
> >
> > One could also ask why it is safe to move the rcu_dynticks_eqs_online()
> > call so late in the CPU-online process. Indeed, there was a time when it
> > would not have been safe, which does much to explain its current location.
> > However, the marking of a CPU as online from an RCU perspective has long
> > since moved from rcutree_prepare_cpu() to rcu_cpu_starting(), and all
> > that is required is that ->dynticks be set correctly by the time that
> > the CPU is marked as online from an RCU perspective. After all, the RCU
> > grace-period kthread does not check to see if offline CPUs are also idle.
> > (In case you were curious, this is one reason why there is quiescent-state
> > reporting as part of the offlining process.)
> >
> > This commit therefore moves the call to rcu_dynticks_eqs_online() from
> > rcutree_prepare_cpu() to rcu_cpu_starting(), this latter being guaranteed
> > to be running on the incoming CPU. The call to this function must of
> > course be placed before this rcu_cpu_starting() announces this CPU's
> > presence to RCU.
> >
> > Reported-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 0172a5fd6d8de..aa00babdaf544 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -4129,7 +4129,6 @@ int rcutree_prepare_cpu(unsigned int cpu)
> > rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
> > rdp->blimit = blimit;
> > rdp->dynticks_nesting = 1; /* CPU not up, no tearing. */
> > - rcu_dynticks_eqs_online();
> > raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
> >
> > /*
> > @@ -4249,6 +4248,7 @@ void rcu_cpu_starting(unsigned int cpu)
> > mask = rdp->grpmask;
> > WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
> > WARN_ON_ONCE(!(rnp->ofl_seq & 0x1));
> > + rcu_dynticks_eqs_online();
> > smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
> > raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com