Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency bug
From: Paul E. McKenney
Date: Thu Mar 07 2024 - 19:55:17 EST
On Thu, Mar 07, 2024 at 02:09:00PM -0800, Linus Torvalds wrote:
> On Thu, 7 Mar 2024 at 13:40, Julia Lawall <julia.lawall@xxxxxxxx> wrote:
> >
> > I tried the following:
> >
> > @@
> > expression x;
> > @@
> >
> > *WRITE_ONCE(x,<+...READ_ONCE(x)...+>)
> >
> > This gave a number of results, shown below. Let me know if some of them
> > are undesirable.
>
> Well, all the ones you list do look like garbage.
>
> That said, quite often the garbage does seem to be "we don't actually
> care about the result". Several of them look like statistics.
[ . . . ]
> > diff -u -p /home/julia/linux/kernel/rcu/tree.c /tmp/nothing/kernel/rcu/tree.c
> > --- /home/julia/linux/kernel/rcu/tree.c
> > +++ /tmp/nothing/kernel/rcu/tree.c
> > @@ -1620,8 +1620,6 @@ static void rcu_gp_fqs(bool first_time)
> > /* Clear flag to prevent immediate re-entry. */
> > if (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) {
> > raw_spin_lock_irq_rcu_node(rnp);
> > - WRITE_ONCE(rcu_state.gp_flags,
> > - READ_ONCE(rcu_state.gp_flags) & ~RCU_GP_FLAG_FQS);
> > raw_spin_unlock_irq_rcu_node(rnp);
>
> This smells bad to me. The code is holding a lock, but apparently not
> one that protects gp_flags.
>
> And that READ_ONCE->WRITE_ONCE sequence can corrupt all the other flags.
>
> Maybe it's fine for some reason (that reason being either that the
> ONCE operations aren't actually needed at all, or because nobody
> *really* cares about the flags), but it smells.
>
> > @@ -1882,8 +1880,6 @@ static void rcu_report_qs_rsp(unsigned l
> > {
> > raw_lockdep_assert_held_rcu_node(rcu_get_root());
> > WARN_ON_ONCE(!rcu_gp_in_progress());
> > - WRITE_ONCE(rcu_state.gp_flags,
> > - READ_ONCE(rcu_state.gp_flags) | RCU_GP_FLAG_FQS);
> > raw_spin_unlock_irqrestore_rcu_node(rcu_get_root(), flags);
>
> Same field, same lock held, same odd smelly pattern.
>
> > - WRITE_ONCE(rcu_state.gp_flags,
> > - READ_ONCE(rcu_state.gp_flags) | RCU_GP_FLAG_FQS);
> > raw_spin_unlock_irqrestore_rcu_node(rnp_old, flags);
>
> .. and again.
In all three cases, the updates are protected by the lock, so the
READ_ONCE() is unnecessary. I have queued a commit to remove the
READ_ONCE()s.
Thanks to both of you (Julia and Linus) for spotting these!
Thanx, Paul