Re: [PATCH 02/11] rcu: Kill rnp->ofl_seq and use only rcu_state.ofl_lock for exclusion

From: David Woodhouse
Date: Thu Dec 09 2021 - 13:53:11 EST


On Thu, 2021-12-09 at 09:18 -0800, Paul E. McKenney wrote:
> On Thu, Dec 09, 2021 at 03:09:29PM +0000, David Woodhouse wrote:
> > From: David Woodhouse <
> > dwmw@xxxxxxxxxxxx
> > >
> >
> > If we allow architectures to bring APs online in parallel, then we end
> > up requiring rcu_cpu_starting() to be reentrant. But currently, the
> > manipulation of rnp->ofl_seq is not thread-safe.
> >
> > However, rnp->ofl_seq is also fairly much pointless anyway since both
> > rcu_cpu_starting() and rcu_report_dead() hold rcu_state.ofl_lock for
> > fairly much the whole time that rnp->ofl_seq is set to an odd number
> > to indicate that an operation is in progress.
> >
> > So drop rnp->ofl_seq completely, and use only rcu_state.ofl_lock.
> >
> > This has a couple of minor complexities: lockdep will complain when we
> > take rcu_state.ofl_lock, and currently accepts the 'excuse' of having
> > an odd value in rnp->ofl_seq. So switch it to an arch_spinlock_t to
> > avoid that false positive complaint. Since we're killing rnp->ofl_seq
> > of course that 'excuse' has to be changed too, so make it check for
> > arch_spin_is_locked(rcu_state.ofl_lock).
> >
> > There's no arch_spin_lock_irqsave() so we have to manually save and
> > restore local interrupts around the locking.
> >
> > At Paul's request, make rcu_gp_init not just wait but *exclude* any
> > CPU online/offline activity, which was fairly much true already by
> > virtue of it holding rcu_state.ofl_lock.
>
> Looks good!
>
> Could you please also make the first clause read something like this?
>
> "At Paul's request based on Neeraj's analysis, ..."
>
> I am going to pull this into -rcu for more intensive testing of this
> code for non-concurrent CPU-online operations (making the above change
> to the commit log). At some point, rcutorture needs to learn how to
> do concurrent CPU-online operations, but it would be good to bake the
> RCU-specific patches for a bit beforehand.
>
> Depending on timing, you might wish to send this patch with the
> rest of this group, so:
>
> Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
>
> If testing goes well and if you don't get it there first, I expect
> to push this during the v5.18 merge window.


Thanks.

I think the best option is probably to let all the cleanups and fixes
get merged separately through whatever tree is most appropriate, and
then we can just merge that final patch to actually *enable* parallel
boot for x86 last of all.

I'll address Neeraj's feedback and repost this one.

Attachment: smime.p7s
Description: S/MIME cryptographic signature