Re: [PATCH -next v2] locking/osq_lock: annotate a data race in osq_lock

From: Paul E. McKenney
Date: Sat May 09 2020 - 00:33:14 EST


On Fri, May 08, 2020 at 04:59:05PM -0400, Qian Cai wrote:
>
>
> > On Feb 11, 2020, at 8:54 AM, Qian Cai <cai@xxxxxx> wrote:
> >
> > prev->next could be accessed concurrently as noticed by KCSAN,
> >
> > write (marked) to 0xffff9d3370dbbe40 of 8 bytes by task 3294 on cpu 107:
> > osq_lock+0x25f/0x350
> > osq_wait_next at kernel/locking/osq_lock.c:79
> > (inlined by) osq_lock at kernel/locking/osq_lock.c:185
> > rwsem_optimistic_spin
> > <snip>
> >
> > read to 0xffff9d3370dbbe40 of 8 bytes by task 3398 on cpu 100:
> > osq_lock+0x196/0x350
> > osq_lock at kernel/locking/osq_lock.c:157
> > rwsem_optimistic_spin
> > <snip>
> >
> > Since the write only stores NULL to prev->next and the read only tests
> > whether prev->next equals this_cpu_ptr(&osq_node), the code still works
> > correctly even if the value is shattered. Thus, mark it as an
> > intentional data race using the data_race() macro.
> >
> > Signed-off-by: Qian Cai <cai@xxxxxx>
>
> Hmm, this patch has been dropped from linux-next for some reason.
>
> Paul, can you pick this up along with KCSAN fixes?
>
> https://lore.kernel.org/lkml/1581429255-12542-1-git-send-email-cai@xxxxxx/

I have queued it on -rcu, but it is too late for v5.8 via the -rcu tree,
so this is v5.9 at the earliest. Plus I would need an ack from one of
the locking folks.

Thanx, Paul

> > ---
> >
> > v2: insert some code comments.
> >
> > kernel/locking/osq_lock.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > index 1f7734949ac8..f733bcd99e8a 100644
> > --- a/kernel/locking/osq_lock.c
> > +++ b/kernel/locking/osq_lock.c
> > @@ -154,7 +154,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >  	 */
> >
> >  	for (;;) {
> > -		if (prev->next == node &&
> > +		/*
> > +		 * cpu_relax() below implies a compiler barrier which would
> > +		 * prevent this comparison being optimized away.
> > +		 */
> > +		if (data_race(prev->next == node) &&
> >  		    cmpxchg(&prev->next, node, NULL) == node)
> >  			break;
> >
> > --
> > 1.8.3.1
> >
>
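
For readers outside the kernel tree, the pattern the patch annotates (a
cheap unsynchronized check followed by an authoritative cmpxchg) can be
sketched in userspace C11. This is only an analogue under stated
assumptions: struct spin_node and try_unlink() are hypothetical names,
and a relaxed atomic load stands in for the data_race()-marked plain
read, which is not the same thing as the kernel macro.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-CPU queue node. */
struct spin_node {
	_Atomic(struct spin_node *) next;
};

/*
 * Try to unlink 'node' from 'prev'.
 *
 * The relaxed load is only a cheap pre-check that avoids an expensive
 * read-modify-write when the link has already changed. A stale result
 * is harmless because the compare-and-swap below is what actually
 * decides the outcome.
 */
static bool try_unlink(struct spin_node *prev, struct spin_node *node)
{
	struct spin_node *expected = node;

	/* Cheap check, analogous to the annotated read in the patch. */
	if (atomic_load_explicit(&prev->next, memory_order_relaxed) != node)
		return false;

	/* Authoritative step: succeeds only if prev->next is still 'node'. */
	return atomic_compare_exchange_strong(&prev->next, &expected,
					      (struct spin_node *)NULL);
}
```

A caller would retry try_unlink() in a loop, pausing between attempts,
much as osq_lock() calls cpu_relax() between iterations.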