Re: Confused about hlist_unhashed_lockless()

From: Eric Dumazet
Date: Fri Jan 31 2020 - 13:55:16 EST


On Fri, Jan 31, 2020 at 10:52 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Fri, Jan 31, 2020 at 08:48:05AM -0800, Eric Dumazet wrote:
> > On Fri, Jan 31, 2020 at 8:43 AM Will Deacon <will@xxxxxxxxxx> wrote:
> > >
> > > Hi folks,
> > >
> > > I just ran into c54a2744497d ("list: Add hlist_unhashed_lockless()")
> > > but I'm a bit confused about what it's trying to achieve. It also seems
> > > to have been merged without any callers (even in -next) -- was that
> > > intentional?
> > >
> > > My main source of confusion is the lack of memory barriers. For example,
> > > if you look at the following pair of functions:
> > >
> > >
> > > static inline int hlist_unhashed_lockless(const struct hlist_node *h)
> > > {
> > > 	return !READ_ONCE(h->pprev);
> > > }
> > >
> > > static inline void hlist_add_before(struct hlist_node *n,
> > > 				    struct hlist_node *next)
> > > {
> > > 	WRITE_ONCE(n->pprev, next->pprev);
> > > 	WRITE_ONCE(n->next, next);
> > > 	WRITE_ONCE(next->pprev, &n->next);
> > > 	WRITE_ONCE(*(n->pprev), n);
> > > }
> > >
> > >
> > > Then running these two concurrently on the same node means that
> > > hlist_unhashed_lockless() doesn't really tell you anything about whether
> > > or not the node is reachable in the list (i.e. there is another node
> > > with a next pointer pointing to it). In other words, I think all of
> > > these outcomes are permitted:
> > >
> > >   hlist_unhashed_lockless(n)   n reachable in list
> > >   0                            0   (No reordering)
> > >   0                            1   (No reordering)
> > >   1                            0   (No reordering)
> > >   1                            1   (Reorder first and last WRITE_ONCEs)
> > >
> > > So I must be missing some details about the use-case here. Please could
> > > you enlighten me? The RCU implementation permits only the first three
> > > outcomes afaict, why not use that and leave non-RCU hlist as it was?
> >
> > I guess the following has been lost:
>
> 4d3d2ae81afd ("timer: Use hlist_unhashed_lockless() in timer_pending()")
> in -rcu, slated for not this merge window but the next one. And
> including the changes in your later email, Eric. Please see below
> and let me know whether you are OK with it.
>
> Thanx, Paul

Well, it seems we only have to wait for data_race() to become available, right?

Then we can push a patch using data_race() instead of READ_ONCE().
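
For illustration only, a rough userspace sketch of what such a change
might look like. The data_race() macro below is a hypothetical stand-in
that just evaluates its argument; the kernel's version would also tell
KCSAN that the race is intentional. The struct layout mirrors the
kernel's hlist_node, but nothing here is an actual patch.

```c
#include <stddef.h>

/*
 * Assumption: userspace stand-in for the kernel's data_race().
 * It only evaluates the expression and provides no ordering or
 * atomicity guarantees of its own.
 */
#define data_race(expr) (expr)

/* Same shape as the kernel's struct hlist_node. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

/*
 * Lockless "is this node unhashed?" check. The read of ->pprev may
 * race with a concurrent writer; the result is only a hint.
 */
static inline int hlist_unhashed_lockless(const struct hlist_node *h)
{
	return !data_race(h->pprev);
}
```

With this, a node whose pprev is still NULL reports 1, and a node
whose pprev has been pointed at a list head reports 0. As noted
earlier in the thread, neither answer says whether the node is
actually reachable from the list at that moment.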