Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock

From: Will Deacon
Date: Mon May 11 2020 - 11:58:19 EST


On Sat, May 09, 2020 at 02:36:54PM -0700, Paul E. McKenney wrote:
> On Sat, May 09, 2020 at 12:53:38PM -0400, Qian Cai wrote:
> >
> >
> > > On May 9, 2020, at 12:12 PM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > > Ah, and I forgot to ask. Why "if (data_race(prev->next == node)" instead
> > > of "if (data_race(prev->next) == node"?
> >
> > I think the one you suggested is slightly better to point out the exact race. Do you want me to resend or you could squash it instead?
>
> The patch was still at the top of my stack, so I just amended it. Here
> is the updated version.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit 13e69ca01ce1621ce74248bda86cfad47fa5a0fa
> Author: Qian Cai <cai@xxxxxx>
> Date: Tue Feb 11 08:54:15 2020 -0500
>
> locking/osq_lock: Annotate a data race in osq_lock
>
> The prev->next pointer can be accessed concurrently as noticed by KCSAN:
>
> write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
> osq_lock+0x25f/0x350
> osq_wait_next at kernel/locking/osq_lock.c:79
> (inlined by) osq_lock at kernel/locking/osq_lock.c:185
> rwsem_optimistic_spin
> <snip>
>
> read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
> osq_lock+0x196/0x350
> osq_lock at kernel/locking/osq_lock.c:157
> rwsem_optimistic_spin
> <snip>
>
> Since the write only stores NULL to prev->next and the read tests if
> prev->next equals to this_cpu_ptr(&osq_node). Even if the value is
> shattered, the code is still working correctly. Thus, mark it as an
> intentional data race using the data_race() macro.
>
> Signed-off-by: Qian Cai <cai@xxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
>
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index 1f77349..1de006e 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> */
>
> for (;;) {
> - if (prev->next == node &&
> + /*
> + * cpu_relax() below implies a compiler barrier which would
> + * prevent this comparison being optimized away.
> + */
> + if (data_race(prev->next) == node &&
> cmpxchg(&prev->next, node, NULL) == node)
> break;

I'm fine with the data_race() placement, but I don't find the comment
very helpful. We assign the result of a READ_ONCE() to 'prev' in the
loop, so I don't think that the cpu_relax() is really relevant.

The reason we don't need READ_ONCE() here is because if we race with
the writer then either we'll go round the loop again after accidentally
thinking prev->next != node, or we'll erroneously attempt the cmpxchg()
because we thought they were equal and that will fail.

Make sense?

Will