Re: Confused about hlist_unhashed_lockless()

From: Paul E. McKenney
Date: Mon Feb 03 2020 - 16:16:11 EST


On Mon, Feb 03, 2020 at 06:27:37PM +0000, Will Deacon wrote:
> Hi Paul,
>
> On Mon, Feb 03, 2020 at 09:29:47AM -0800, Paul E. McKenney wrote:
> > On Mon, Feb 03, 2020 at 04:02:28PM +0000, Will Deacon wrote:
> > > On Mon, Feb 03, 2020 at 07:58:39AM -0800, Paul E. McKenney wrote:
> > > > On Mon, Feb 03, 2020 at 03:45:54PM +0000, David Laight wrote:
> > > > > From: Eric Dumazet
> > > > > > Sent: 31 January 2020 18:53
> > > > > >
> > > > > > On Fri, Jan 31, 2020 at 10:48 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > >
> > > > > > > This is nice, now with have data_race()
> > > > > > >
> > > > > > > Remember these patches were sent 2 months ago, at a time we were
> > > > > > > trying to sort out things.
> > > > > > >
> > > > > > > data_race() was merged a few days ago.
> > > > > >
> > > > > > Well, actually data_race() is not there yet anyway.
> > > > >
> > > > > Shouldn't it be NO_DATA_RACE() ??
> > > >
> > > > No, because you use data_race() when there really are data races, but you
> > > > want KCSAN to ignore them. For example, diagnostic code that doesn't
> > > > participate in the actual concurrency design and that doesn't run all
> > > > that often might use data_race(). For another example, if a developer
> > > > knew that data races existed, but that the compiler could not reasonably
> > > > do anything untoward with those data races, that developer might well
> > > > choose to use data_race() instead of READ_ONCE(). Especially if the
> > > > access in question was on a fastpath where helpful compiler optimizations
> > > > would be prohibited by use of READ_ONCE().
> > >
> > > Yes, and in this particular case I think we can remove some WRITE_ONCE()s
> > > from the non-RCU hlist code too (similarly for hlist_nulls).
> >
> > Quite possibly, but we should take them case by case. READ_ONCE()
> > really does protect against some optimizations, while data_race() does
> > not at all.
>
> Agreed, and I plan to send patches for review so we can discuss them in
> more detail then.

Looking forward to seeing them!

> > But yes, in some cases you want to -avoid- using READ_ONCE() and
> > WRITE_ONCE() so that KCSAN can do its job. For example, given a per-CPU
> > variable that is only supposed to be accessed from the corresponding CPU
> > except for reads by diagnostic code, you should have the main algorithm
> > use plain C-language reads and writes, and have the diagnostic code
> > use data_race(). This allows KCSAN to correctly flag bugs that access
> > this per-CPU variable off-CPU while leaving the diagnostic code alone.
>
> Yes, and in a similar vein I think the WRITE_ONCE() additions to the hlist
> code may hide unintentional racy access to the hlist where I would argue
> that the correct behaviour is either to acknowledge the data race (like the
> timer code) or to use the RCU variant. The problem with what's currently in
> mainline is that it reads a bit like the non-RCU hlist is directly usable as
> a lock-free list implementation, which really isn't the case.

Fair point. So your thought is that the RCU variant (or something
similar to it) is used in the lockless case, and that KCSAN complains
about lockless use of the non-READ_ONCE() interface? If so, that seems
like it might work quite well.

> > Seem reasonable?
>
> It does to me, but we should probably try to apply this a bit more
> consistently in patch review. Adding {READ,WRITE}_ONCE() until the
> sanitiser shuts up is easy, but picking that apart later on is a real
> challenge.

Agreed. It does seem like early days for laying out a reasonable set
of use cases, but we do have to start somewhere. My hope is that the
resulting automated review of concurency will be more than worth the
effort. Hope springs eternal and all that... :-)

Thanx, Paul