Re: [PATCH v2 rcu 04/18] rcu: Weaken ->dynticks accesses and updates

From: Mathieu Desnoyers
Date: Wed Jul 28 2021 - 16:03:06 EST


----- On Jul 28, 2021, at 3:45 PM, paulmck paulmck@xxxxxxxxxx wrote:
[...]
>
> And how about like this?
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit cb8914dcc6443cca15ce48d937a93c0dfdb114d3
> Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
> Date: Wed Jul 28 12:38:42 2021 -0700
>
> rcu: Move rcu_dynticks_eqs_online() to rcu_cpu_starting()
>
> The purpose of rcu_dynticks_eqs_online() is to adjust the ->dynticks
> counter of an incoming CPU if required. It is currently is invoked

"is currently is" -> "is currently"

> from rcutree_prepare_cpu(), which runs before the incoming CPU is
> running, and thus on some other CPU. This makes the per-CPU accesses in
> rcu_dynticks_eqs_online() iffy at best, and it all "works" only because
> the running CPU cannot possibly be in dyntick-idle mode, which means
> that rcu_dynticks_eqs_online() never has any effect. One could argue
> that this means that rcu_dynticks_eqs_online() is unnecessary, however,
> removing it makes the CPU-online process vulnerable to slight changes
> in the CPU-offline process.

Why favor moving this from the prepare_cpu to the cpu_starting hotplug step,
rather than using the target CPU's rdp from rcutree_prepare_cpu()? Maybe there
was a good reason for having this very early in the prepare_cpu step?

Also, the commit message refers to this bug as having no effect because the
running CPU cannot possibly be in dyntick-idle mode. I understand that calling
this function indeed had no effect, but then why is it OK for the CPU coming
online to skip this call in the first place? This commit message hints at
"slight changes in the CPU-offline process" which could break it, but there is
no explanation of what makes this not an actual bug fix.
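
To make the "target CPU's rdp" alternative concrete, here is a minimal
user-space sketch, not kernel code. Everything in it is a hypothetical
stand-in: rcu_data_model, this_cpu_rdp(), per_cpu_rdp(), current_cpu and
NR_CPUS_MODEL are invented for illustration, and eqs_online() assumes that an
odd ->dynticks value means "not in an extended quiescent state". It only shows
why a this-CPU access made from another CPU touches the wrong counter, whereas
an access through the target CPU's rdp does not.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS_MODEL 4			/* hypothetical CPU count */

/* Hypothetical stand-in for the ->dynticks field of struct rcu_data. */
struct rcu_data_model {
	atomic_int dynticks;
};

static struct rcu_data_model rcu_data_model[NR_CPUS_MODEL];
static int current_cpu;			/* models the CPU running the code */

/* Models a this-CPU access: always the CPU executing the caller. */
static struct rcu_data_model *this_cpu_rdp(void)
{
	return &rcu_data_model[current_cpu];
}

/* Models an explicit per-CPU access to a chosen target CPU. */
static struct rcu_data_model *per_cpu_rdp(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	return &rcu_data_model[cpu];
}

/*
 * Assumed semantics: if the counter is even, bump it to odd; if it is
 * already odd, leave it alone.
 */
static void eqs_online(struct rcu_data_model *rdp)
{
	if (atomic_load(&rdp->dynticks) & 0x1)
		return;
	atomic_fetch_add(&rdp->dynticks, 1);
}
```

With current_cpu set to a running CPU whose counter is odd,
eqs_online(this_cpu_rdp()) is a no-op and leaves the incoming CPU's counter
untouched, whereas eqs_online(per_cpu_rdp(incoming)) adjusts the counter that
was actually meant.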

Thanks,

Mathieu

>
> This commit therefore moves the call to rcu_dynticks_eqs_online() from
> rcutree_prepare_cpu() to rcu_cpu_starting(), this latter being guaranteed
> to be running on the incoming CPU. The call to this function must of
> course be placed before this rcu_cpu_starting() announces this CPU's
> presence to RCU.
>
> Reported-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 0172a5fd6d8de..aa00babdaf544 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -4129,7 +4129,6 @@ int rcutree_prepare_cpu(unsigned int cpu)
> rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
> rdp->blimit = blimit;
> rdp->dynticks_nesting = 1; /* CPU not up, no tearing. */
> - rcu_dynticks_eqs_online();
> raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
>
> /*
> @@ -4249,6 +4248,7 @@ void rcu_cpu_starting(unsigned int cpu)
> mask = rdp->grpmask;
> WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
> WARN_ON_ONCE(!(rnp->ofl_seq & 0x1));
> + rcu_dynticks_eqs_online();
> smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
> raw_spin_lock_irqsave_rcu_node(rnp, flags);
> WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com