Re: [PATCH] rcu: simplify the calculation of rcu_state.ncpus

From: Wei Yang
Date: Sat Apr 18 2020 - 18:02:50 EST


On Sat, Apr 18, 2020 at 11:19 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Fri, Apr 17, 2020 at 09:39:51PM +0000, Wei Yang wrote:
> > There is only 1 bit set in mask, which means the difference between
> > oldmask and the new one would be at the position where the bit is set in
> > mask.
> >
> > Based on this knowledge, rcu_state.ncpus could be calculated by checking
> > whether mask is already set in oldmask.
>
> Nice!!! Good eyes!
>
> > BTW, the comment at the last of this line is mysterious. Not sure it
> > could be removed or not.
>
> The "^^^" in that comment says to look at the comment on the preceding
> line. Memory-ordering functions like smp_store_release() are supposed
> to have comments indicating what they are ordering. ;-)
>
> Could you please do the following things and resubmit?
>
> 1. Forward-port to -rcu branch dev? This tree lives here:
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
>
> 2. Given that oldmask is used only to test to see if a new bit
> was set, why not just replace oldmask with a bool variable
> that is set to "!(rnp->expmaskinitnext & mask)" before the
> bit is ORed into rnp->expmaskinitnext?
>
> 3. Put the comment inside the "if" statement with the
> smp_store_release().
>
> 4. In -rcu, you will find a ASSERT_EXCLUSIVE_WRITER() statement
> that should also be placed inside the "if" statement with
> the smp_store_release().
>

Oops, my email client EAT this mail. Hope this mail will not be banned.

I adjust the code a little according to your suggestion like below. Is this
what you expected?

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f288477ee1c2..f01367a80b70 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3732,10 +3732,9 @@ void rcu_cpu_starting(unsigned int cpu)
{
unsigned long flags;
unsigned long mask;
- int nbits;
- unsigned long oldmask;
struct rcu_data *rdp;
struct rcu_node *rnp;
+ bool has_seen;

if (per_cpu(rcu_cpu_started, cpu))
return;
@@ -3747,13 +3746,13 @@ void rcu_cpu_starting(unsigned int cpu)
mask = rdp->grpmask;
raw_spin_lock_irqsave_rcu_node(rnp, flags);
WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
- oldmask = rnp->expmaskinitnext;
+ has_seen = rnp->expmaskinitnext & mask;
rnp->expmaskinitnext |= mask;
- oldmask ^= rnp->expmaskinitnext;
- nbits = bitmap_weight(&oldmask, BITS_PER_LONG);
- /* Allow lockless access for expedited grace periods. */
- smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + nbits); /* ^^^ */
- ASSERT_EXCLUSIVE_WRITER(rcu_state.ncpus);
+ if (!has_seen) {
+ /* Allow lockless access for expedited grace periods. */
+ smp_store_release(&rcu_state.ncpus, rcu_state.ncpus +
1); /* ^^^ */
+ ASSERT_EXCLUSIVE_WRITER(rcu_state.ncpus);
+ }
rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);

> Thanx, Paul
>
> > Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
> > ---
> > kernel/rcu/tree.c | 6 ++----
> > 1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index d91c9156fab2..f0d9251fa663 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3364,7 +3364,6 @@ void rcu_cpu_starting(unsigned int cpu)
> > {
> > unsigned long flags;
> > unsigned long mask;
> > - int nbits;
> > unsigned long oldmask;
> > struct rcu_data *rdp;
> > struct rcu_node *rnp;
> > @@ -3381,10 +3380,9 @@ void rcu_cpu_starting(unsigned int cpu)
> > rnp->qsmaskinitnext |= mask;
> > oldmask = rnp->expmaskinitnext;
> > rnp->expmaskinitnext |= mask;
> > - oldmask ^= rnp->expmaskinitnext;
> > - nbits = bitmap_weight(&oldmask, BITS_PER_LONG);
> > /* Allow lockless access for expedited grace periods. */
> > - smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + nbits); /* ^^^ */
> > + if (!(oldmask & mask))
> > + smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + 1); /* ^^^ */
> > rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
> > rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
> > rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
> > --
> > 2.23.0
> >