Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency bug
From: Paul E. McKenney
Date: Wed Mar 06 2024 - 22:06:20 EST
On Wed, Mar 06, 2024 at 06:43:42PM -0800, Linus Torvalds wrote:
> On Wed, 6 Mar 2024 at 18:29, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > TL;DR: Those ->rtort_pipe_count increments cannot run concurrently
> > with each other or any other update of that field, so that update-side
> > READ_ONCE() call is unnecessary and the update-side plain C-language
> > read is OK. The WRITE_ONCE() calls are there for the benefit of the
> > lockless read-side accesses to rtort_pipe_count.
>
> Ahh. Ok. That makes a bit more sense.
>
> So if that's the case, then the "updating side" should never use
> READ_ONCE, because there's nothing else to protect against.
>
> Honestly, this all makes me think that we'd be *much* better off
> showing the real "handoff" with smp_store_release() and
> smp_load_acquire().
>
> IOW, something like this (TOTALLY UNTESTED!) patch, perhaps?
>
> And please note that this patch is not only untested, it really is a
> very handwavy patch.
>
> I'm sending it as a patch just because it's a more precise way of
> saying "I think the writers and readers could use the store-release ->
> load-acquire not just to avoid any worries about accessing things
> once, but also as a way to show the directional 'flow' of the data".
>
> I dunno.
Thank you for looking at this!

I will look carefully at this, but the reason I didn't do it this way
to begin with is that I did not want the added synchronization (and its
overhead) to produce false negatives, that is, to let weak-memory RCU
bugs slip past the test unnoticed. Of course, on x86 that synchronization
is (nearly) free.
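
Just to make sure we are describing the same handoff, here is a rough
sketch of the pattern I believe you are proposing, using made-up names
rather than the actual rcutorture fields:

	struct foo {
		int count;		/* hypothetical shared counter */
	};
	struct foo *rp;			/* shared structure, setup omitted */
	int c;

	/* Updater: every write preceding this store-release is ordered
	 * before the new value of ->count becomes visible. */
	smp_store_release(&rp->count, rp->count + 1);

	/* Reader: once this load-acquire observes the new value, it is
	 * also guaranteed to observe everything the updater did before
	 * the store-release above. */
	c = smp_load_acquire(&rp->count);

In contrast, the current WRITE_ONCE()/READ_ONCE() pairing only prevents
load/store tearing and fusing and deliberately adds no ordering, so any
ordering that rcutorture observes has to be supplied by the RCU
implementation under test rather than by the test harness itself.
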
Thanx, Paul
> Linus
> kernel/rcu/rcutorture.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index 7567ca8e743c..60b74df3eae2 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -461,12 +461,12 @@ rcu_torture_pipe_update_one(struct rcu_torture *rp)
>  		WRITE_ONCE(rp->rtort_chkp, NULL);
>  		smp_store_release(&rtrcp->rtc_ready, 1); // Pair with smp_load_acquire().
>  	}
> -	i = READ_ONCE(rp->rtort_pipe_count);
> +	i = rp->rtort_pipe_count;
>  	if (i > RCU_TORTURE_PIPE_LEN)
>  		i = RCU_TORTURE_PIPE_LEN;
>  	atomic_inc(&rcu_torture_wcount[i]);
> -	WRITE_ONCE(rp->rtort_pipe_count, i + 1);
> -	if (rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
> +	smp_store_release(&rp->rtort_pipe_count, ++i);
> +	if (i >= RCU_TORTURE_PIPE_LEN) {
>  		rp->rtort_mbtest = 0;
>  		return true;
>  	}
> @@ -1408,8 +1408,7 @@ rcu_torture_writer(void *arg)
>  			if (i > RCU_TORTURE_PIPE_LEN)
>  				i = RCU_TORTURE_PIPE_LEN;
>  			atomic_inc(&rcu_torture_wcount[i]);
> -			WRITE_ONCE(old_rp->rtort_pipe_count,
> -				   old_rp->rtort_pipe_count + 1);
> +			smp_store_release(&old_rp->rtort_pipe_count, ++i);
> 
>  			// Make sure readers block polled grace periods.
>  			if (cur_ops->get_gp_state && cur_ops->poll_gp_state) {
> @@ -1991,7 +1990,7 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp, long myid)
>  	rcu_torture_reader_do_mbchk(myid, p, trsp);
>  	rtrsp = rcutorture_loop_extend(&readstate, trsp, rtrsp);
>  	preempt_disable();
> -	pipe_count = READ_ONCE(p->rtort_pipe_count);
> +	pipe_count = smp_load_acquire(&p->rtort_pipe_count);
>  	if (pipe_count > RCU_TORTURE_PIPE_LEN) {
>  		/* Should not happen, but... */
>  		pipe_count = RCU_TORTURE_PIPE_LEN;