Re: [PATCH 01/16] rcu/nocb: Fix potential missed nocb_timer rearm

From: Frederic Weisbecker
Date: Thu Jan 28 2021 - 16:24:49 EST


On Thu, Jan 28, 2021 at 10:48:34AM -0800, Paul E. McKenney wrote:
> On Thu, Jan 28, 2021 at 06:12:07PM +0100, Frederic Weisbecker wrote:
> > The "nocb_bypass_timer" ends up calling wake_nocb_gp() which deletes
> > the pending "nocb_timer" (note they are not the same timers) for the
> > given rdp without resetting the matching state stored in
> > nocb_defer_wakeup.
> >
> > As a result, a future call_rcu() on that rdp may be fooled and think the
> > timer is armed when it's not, missing a deferred nocb_gp wakeup.
> >
> > Fix this by resetting rdp->nocb_defer_wakeup when we disarm the timer.
> >
> > Fixes: d1b222c6be1f ("rcu/nocb: Add bypass callback queueing")
> > Cc: Stable <stable@xxxxxxxxxxxxxxx>
> > Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> > Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
> > Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > Cc: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
> > Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > ---
> > kernel/rcu/tree_plugin.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 7e33dae0e6ee..a44f80d7661b 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -1705,6 +1705,8 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
> > rcu_nocb_unlock_irqrestore(rdp, flags);
> > return false;
> > }
> > +
> > + rdp->nocb_defer_wakeup = RCU_NOCB_WAKE_NOT;
>
> Given this change, does it make sense to remove the
> setting of ->nocb_defer_wakeup to RCU_NOCB_WAKE_NOT from the
> do_nocb_deferred_wakeup_common() function?

I do that later, in "[PATCH 09/16] rcu/nocb: Merge nocb_timer to the rdp leader".

> Does the above assignment need
> to be WRITE_ONCE(), in other words, are all reads of ->nocb_defer_wakeup
> done with either ->nocb_lock or ->nocb_gp_lock held? (I do not believe
> that this is the case.)

Ah indeed, it should probably be done with WRITE_ONCE(), because it's read
locklessly in many places.
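
To illustrate the pattern, here is a minimal user-space sketch, not the
actual kernel code. The READ_ONCE()/WRITE_ONCE() macros below are simplified
stand-ins for the kernel ones (they rely on the GCC/Clang __typeof__
extension), the struct and all three helper names are hypothetical, and the
timer_pending flag only stands in for the real nocb_timer:

```c
#include <stdbool.h>

/* Simplified user-space stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Stand-in values for this sketch only. */
#define RCU_NOCB_WAKE_NOT 0
#define RCU_NOCB_WAKE     1

/* Hypothetical, cut-down version of the per-CPU state discussed here. */
struct rdp_sketch {
	int nocb_defer_wakeup;	/* may be read without any lock held */
	bool timer_pending;	/* stands in for the real nocb_timer */
};

/* Hypothetical helper: record a deferred wakeup and arm the timer. */
static void arm_deferred_wakeup(struct rdp_sketch *rdp)
{
	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE);
	rdp->timer_pending = true;
}

/*
 * Hypothetical helper: disarm the timer and reset the matching state
 * together, so that a later lockless reader does not see a stale
 * "timer armed" indication.
 */
static void disarm_deferred_wakeup(struct rdp_sketch *rdp)
{
	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
	rdp->timer_pending = false;
}

/* Hypothetical lockless reader, paired with the WRITE_ONCE() above. */
static bool deferred_wakeup_pending(struct rdp_sketch *rdp)
{
	return READ_ONCE(rdp->nocb_defer_wakeup) != RCU_NOCB_WAKE_NOT;
}
```

In this sketch, calling arm_deferred_wakeup() and then
disarm_deferred_wakeup() leaves deferred_wakeup_pending() returning false,
which is the invariant the patch is meant to restore.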

Thanks.

>
> Thanx, Paul
>
> > del_timer(&rdp->nocb_timer);
> > rcu_nocb_unlock_irqrestore(rdp, flags);
> > raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
> > --
> > 2.25.1
> >