Re: [PATCH tip/core/rcu 2/2] rcu: Make expedited GPs handle CPU 0 being offline
From: Paul E. McKenney
Date: Tue Jun 26 2018 - 08:29:57 EST
On Tue, Jun 26, 2018 at 07:46:52PM +0800, Boqun Feng wrote:
> On Tue, Jun 26, 2018 at 06:44:47PM +0800, Boqun Feng wrote:
> > On Tue, Jun 26, 2018 at 11:38:20AM +0200, Peter Zijlstra wrote:
> > > On Mon, Jun 25, 2018 at 03:43:32PM -0700, Paul E. McKenney wrote:
> > > > + preempt_disable();
> > > > + for_each_leaf_node_possible_cpu(rnp, cpu) {
> > > > + if (cpu_is_offline(cpu)) /* Preemption disabled. */
> > > > + continue;
> > >
> > > Create for_each_node_online_cpu() instead? Seems a bit pointless to
> > > iterate possible mask only to then check it against the online mask.
> > > Just iterate the online mask directly.
> > >
> > > Or better yet, write this as:
> > >
> > > preempt_disable();
> > > cpu = cpumask_next(rnp->grplo - 1, cpu_online_mask);
> > > if (cpu > rnp->grphi)
> > > cpu = WORK_CPU_UNBOUND;
> > > queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
> > > preempt_enable();
> > >
> > > Which is what it appears to be doing.
> > >
> >
> > Make sense! Thanks ;-)
> >
> > Applied this and running a TREE03 rcutorture. If all go well, I will
> > send the updated patch.
>
> So the patch has passed one 30 min run for TREE03 rcutorture. Paul,
> if it looks good, could you take it for your next spin or pull request
> in the future? Thanks.
Looks much improved, thank you both! I will pull this in later
today, Pacific Time.
Thanx, Paul
> Regards,
> Boqun
>
> -------------->8
> Subject: [PATCH tip/core/rcu v2] rcu: exp: Make expedited GPs handle CPU 0 being offline
>
> Currently, the parallelized initialization of expedited grace periods
> uses the workqueue associated with each rcu_node structure's ->grplo
> field. This works fine unless that CPU is offline. This commit
> therefore uses the CPU corresponding to the lowest-numbered online CPU,
> or fallback to queue the work on WORK_CPU_UNBOUND if there are no online
> CPUs on this rcu_node structure.
>
> Note that this patch uses cpu_online_mask instead of the usual approach
> of checking bits in the rcu_node structure's ->qsmaskinitnext field.
> This is safe because preemption is disabled across both the
> cpu_online_mask check and the call to queue_work_on().
>
> Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> [ paulmck: Disable preemption to close offline race window. ]
> Not-Yet-Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> ---
> v1 --> v2:
>
> * Replace the for_each_leaf_node_possible_cpu() + cpu_is_offline()
> check loop with a single cpumask_next() as suggested by Peter
> Zijlstra
>
> kernel/rcu/tree_exp.h | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index d40708e8c5d6..3bf87fd4bd91 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -473,6 +473,7 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
> smp_call_func_t func)
> {
> struct rcu_node *rnp;
> + int cpu;
>
> trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("reset"));
> sync_exp_reset_tree(rsp);
> @@ -492,7 +493,13 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
> continue;
> }
> INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
> - queue_work_on(rnp->grplo, rcu_par_gp_wq, &rnp->rew.rew_work);
> + preempt_disable();
> + cpu = cpumask_next(rnp->grplo - 1, cpu_online_mask);
> + /* All offlines, queue the work on an unbound CPU */
> + if (unlikely(cpu > rnp->grphi))
> + cpu = WORK_CPU_UNBOUND;
> + queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
> + preempt_enable();
> rnp->exp_need_flush = true;
> }
>
> --
> 2.17.1
>