Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()

From: Will Deacon
Date: Fri Apr 24 2020 - 13:39:42 EST


On Mon, Mar 30, 2020 at 04:13:15PM -0700, Paul E. McKenney wrote:
> On Tue, Mar 24, 2020 at 09:32:01PM +0000, Will Deacon wrote:
> > [mutt crashed while I was sending this; apologies if you receive it twice]
> >
> > On Tue, Mar 24, 2020 at 05:56:15PM +0100, Jann Horn wrote:
> > > On Tue, Mar 24, 2020 at 5:51 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > > On Tue, Mar 24, 2020 at 03:36:25PM +0000, Will Deacon wrote:
> > > > > diff --git a/include/linux/list.h b/include/linux/list.h
> > > > > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > > > > --- a/include/linux/list.h
> > > > > +++ b/include/linux/list.h
> > > > > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> > > > >   */
> > > > >  static inline int list_empty(const struct list_head *head)
> > > > >  {
> > > > > -	return READ_ONCE(head->next) == head;
> > > > > +	return data_race(READ_ONCE(head->next) == head);
> > > > >  }
> > > >
> > > > list_empty() isn't lockless safe, that's what we have
> > > > list_empty_careful() for.
> > >
> > > That thing looks like it could also use some READ_ONCE() sprinkled in...
> >
> > Crikey, how did I miss that? I need to spend some time understanding the
> > ordering there.
> >
> > So it sounds like the KCSAN splats relating to list_empty() and loosely
> > referred to by 1c97be677f72 ("list: Use WRITE_ONCE() when adding to lists
> > and hlists") are indicative of real bugs and we should actually restore
> > list_empty() to its former glory prior to 1658d35ead5d ("list: Use
> > READ_ONCE() when testing for empty lists"). Alternatively, assuming
> > list_empty_careful() does what it says on the tin, we could just make that
> > the default.
>
> The list_empty_careful() function (suitably annotated) returns false if
> the list is non-empty, including when it is in the process of becoming
> either empty or non-empty. It would be fine for the lockless use cases
> I have come across.

Hmm, I had a look at the implementation and I'm not at all convinced that
it's correct. First of all, the comment above it states:

* NOTE: using list_empty_careful() without synchronization
* can only be safe if the only activity that can happen
* to the list entry is list_del_init(). Eg. it cannot be used
* if another CPU could re-list_add() it.

but it seems that people disregard this note and instead use it as a
general-purpose lockless test, taking a lock and rechecking if it returns
non-empty. Making it the default would also mean we'd have to keep the
WRITE_ONCE() in INIT_LIST_HEAD(), which is something that I've been trying
to remove.
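
For readers unfamiliar with the usage pattern described above (lockless
check first, then take the lock and recheck), here is a minimal userspace
sketch. It is not kernel code: struct work_queue, wq_lock(), wq_unlock(),
try_dequeue() and demo_try_dequeue() are hypothetical names, the spinlock is
a C11 atomic_flag standing in for whatever lock a real caller would use, and
list_empty_careful() is reproduced with plain loads in the shape I believe
it had at the time of this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = new;
	new->next = head;
	new->prev = prev;
	prev->next = new;
}

static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/* Assumed shape of the check: two plain loads, no barriers. */
static int list_empty_careful(const struct list_head *head)
{
	const struct list_head *next = head->next;

	return (next == head) && (next == head->prev);
}

/* Hypothetical container: a list protected by a simple spinlock. */
struct work_queue {
	atomic_flag lock;
	struct list_head items;
};

static void wq_lock(struct work_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin */
}

static void wq_unlock(struct work_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

static struct list_head *try_dequeue(struct work_queue *q)
{
	struct list_head *item = NULL;

	/* Lockless fast path: the result is only a hint. */
	if (list_empty_careful(&q->items))
		return NULL;

	wq_lock(q);
	/* Recheck under the lock, where the answer is stable. */
	if (q->items.next != &q->items) {
		item = q->items.next;
		list_del_init(item);
	}
	wq_unlock(q);
	return item;
}

/* Single-threaded walk through the three interesting cases. */
static int demo_try_dequeue(void)
{
	struct work_queue q = { .lock = ATOMIC_FLAG_INIT };
	struct list_head a;

	init_list_head(&q.items);
	assert(try_dequeue(&q) == NULL);	/* empty: fast path, no lock */
	list_add_tail(&a, &q.items);
	assert(try_dequeue(&q) == &a);		/* non-empty: lock, recheck, pop */
	assert(try_dequeue(&q) == NULL);	/* empty again */
	return 1;
}
```

The recheck under the lock only covers a false "non-empty". A false
"empty" from the lockless check makes the caller skip the locked path
entirely, so nothing in this pattern catches it.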

In the face of something like a concurrent list_add(); list_add_tail()
sequence, torn writes to the head->{prev,next} pointers could cause
list_empty_careful() to report, momentarily, that the list is empty.
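
To make the stores in question concrete, here is a single-threaded userspace
sketch that replays list_add() on an empty list one store at a time and runs
the check after each store. The store order and the placement of the single
WRITE_ONCE() are my recollection of __list_add() at the time of this thread
and should be treated as an assumption; struct list_head_model,
list_add_step() and replay_list_add() are hypothetical names. A torn store
cannot be reproduced portably, so this only shows the untorn baseline and
marks which stores are plain.

```c
#include <assert.h>

/* Userspace model of struct list_head; not the kernel type. */
struct list_head_model {
	struct list_head_model *next, *prev;
};

/* Assumed shape of list_empty_careful(): two plain loads. */
static int model_empty_careful(const struct list_head_model *head)
{
	const struct list_head_model *next = head->next;

	return (next == head) && (next == head->prev);
}

/* Perform exactly one of the four stores of __list_add(new, prev, next). */
static void list_add_step(struct list_head_model *new,
			  struct list_head_model *prev,
			  struct list_head_model *next, int step)
{
	switch (step) {
	case 1: next->prev = new; break;	/* plain store */
	case 2: new->next = next; break;	/* plain store */
	case 3: new->prev = prev; break;	/* plain store */
	case 4: prev->next = new; break;	/* WRITE_ONCE() in the kernel */
	}
}

/*
 * Replay list_add(&a, &head) on an empty list and count how many of the
 * four intermediate states the check reports as "empty".
 */
static int replay_list_add(void)
{
	struct list_head_model head = { &head, &head };
	struct list_head_model a;
	struct list_head_model *next = head.next;	/* captured once, as list_add() does */
	int empty_reports = 0;

	assert(model_empty_careful(&head));	/* empty before any store */

	for (int step = 1; step <= 4; step++) {
		list_add_step(&a, &head, next, step);
		if (model_empty_careful(&head))
			empty_reports++;
	}
	return empty_reports;
}
```

With whole, untorn stores observed in program order the check never reports
"empty" once the first store has landed, because head->prev already differs
from head. The concern is what a reader can observe when the plain stores to
head->prev are not seen as whole values.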

I've started looking at whether we can use a NULL next pointer to indicate
an empty list, which might allow us to kill the __list_del_clearprev() hack
at the same time, but I've not found enough time to really get my teeth into
it yet.

Will