Re: [PATCH] list: Prevent compiler reloads inside 'safe' list iteration

From: Marco Elver
Date: Tue Mar 10 2020 - 10:10:09 EST


On Tue, 10 Mar 2020 at 13:50, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Tue, Mar 10, 2020 at 12:23:34PM +0000, David Laight wrote:
> > From: Chris Wilson
> > > Sent: 10 March 2020 11:50
> > >
> > > Quoting David Laight (2020-03-10 11:36:41)
> > > > From: Chris Wilson
> > > > > Sent: 10 March 2020 09:21
> > > > > Instruct the compiler to read the next element in the list iteration
> > > > > once, and that it is not allowed to reload the value from the stale
> > > > > element later. This is important as during the course of the safe
> > > > > iteration, the stale element may be poisoned (unbeknownst to the
> > > > > compiler).
> > > >
> > > > Eh?
> > > > I thought any function call will stop the compiler being allowed
> > > > to reload the value.
> > > > The 'safe' loop iterators are only 'safe' against called
> > > > code removing the current item from the list.
> > > >
> > > > > This helps prevent kcsan warnings over 'unsafe' conduct in releasing the
> > > > > list elements during list_for_each_entry_safe() and friends.
> > > >
> > > > Sounds like kcsan is buggy ????
>
> Adding Marco on CC for his thoughts.

I'd have to see a stack-trace with line-numbers.

But keep in mind what KCSAN does, which is report "data races". If the
KCSAN report showed 2 accesses, where one of them was a *plain* read
(and the other a write), then it's a valid data race (per LKMM's
definition). It seems this was the case here.

As mentioned, the compiler is free to transform plain accesses in
various concurrency-unfriendly ways.
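
For illustration only, here is a user-space sketch of what a marked read
of the next pointer looks like in a 'safe' iteration. READ_ONCE_SKETCH,
struct node, push() and sum_and_free() are names made up for this example;
the macro only approximates the kernel's READ_ONCE(), and __typeof__
assumes GCC or Clang.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's READ_ONCE(): the volatile
 * access tells the compiler to perform this load exactly where written
 * and not to re-derive the value from memory later. */
#define READ_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

struct node {
	struct node *next;
	int value;
};

/* Push a new node onto the front of the list; returns the new head. */
static struct node *push(struct node *head, int value)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		abort();
	n->next = head;
	n->value = value;
	return n;
}

/* 'Safe' iteration: fetch the successor once, before the current node
 * is released, and never touch p->next again afterwards. */
static int sum_and_free(struct node *head)
{
	struct node *p, *next;
	int sum = 0;

	for (p = head; p; p = next) {
		next = READ_ONCE_SKETCH(p->next);
		sum += p->value;
		free(p);	/* p is stale from here on */
	}
	return sum;
}
```

Building a list with push() and handing the head to sum_and_free() walks
and releases every node while only ever reading each next pointer once.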

FWIW, for writes we're already being quite generous, in that plain
aligned writes up to word-size are assumed to be "atomic" with the
default (conservative) config, i.e. marking such writes is optional.
Although, that's a generous assumption that is not always guaranteed
to hold (https://lore.kernel.org/lkml/20190821103200.kpufwtviqhpbuv2n@willie-the-truck/).
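
As a rough sketch of the difference between a plain and a marked write
(again user-space only; WRITE_ONCE_SKETCH, READ_ONCE_SKETCH, struct
request_state and the helpers are hypothetical names, the macros only
approximate the kernel's WRITE_ONCE()/READ_ONCE(), and __typeof__ assumes
GCC or Clang):

```c
/* Hypothetical stand-ins for the kernel's WRITE_ONCE()/READ_ONCE(). */
#define WRITE_ONCE_SKETCH(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

struct request_state {
	unsigned long flags;	/* word-sized and naturally aligned */
};

/* Plain store: nothing tells the compiler that other threads may be
 * looking, so it may in principle split, fuse, or repeat the store. */
static void set_flags_plain(struct request_state *s, unsigned long v)
{
	s->flags = v;
}

/* Marked store: a volatile access of the whole word, which the compiler
 * may not elide or merge with neighbouring accesses. */
static void set_flags_marked(struct request_state *s, unsigned long v)
{
	WRITE_ONCE_SKETCH(s->flags, v);
}

/* Marked load, the counterpart a concurrent reader would use. */
static unsigned long get_flags_marked(struct request_state *s)
{
	return READ_ONCE_SKETCH(s->flags);
}
```

Single-threaded, both setters leave the same value behind; the marking
only matters once another thread can access the field concurrently.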

If there is code for which you prefer not to see KCSAN reports at all,
you are free to disable them with KCSAN_SANITIZE_file.o := n in the
corresponding Makefile.

Thanks,
-- Marco

> > > The warning kcsan gave made sense (a strange case where emptying the
> > > list from inside the safe iterator would allow that list to be taken
> > > under a global mutex and have one extra request added to it). The
> > > list_for_each_entry_safe() should be ok in this scenario, so long as the
> > > next element is read before this element is dropped, and the compiler is
> > > instructed not to reload the element.
> >
> > Normally the loop iteration code has to hold the mutex.
> > I guess it can be released inside the loop provided no other
> > code can ever delete entries.
> >
> > > kcsan is a little more insistent on having that annotation :)
> > >
> > > In this instance I would say it was a false positive from kcsan, but I
> > > can see why it would complain and suspect that given a sufficiently
> > > aggressive compiler, we may be caught out by a late reload of the next
> > > element.
> >
> > If you have:
> >     for (; p; p = next) {
> >         next = p->next;
> >         external_function_call();
> >     }
> > the compiler must assume that the function call
> > can change 'p->next' and read it before the call.
>
> That "must assume" is a statement of current compiler technology.
> Given the progress over the past forty years, I would not expect this
> restriction to hold forever. Yes, we can and probably will get the
> compiler implementers to give us command-line flags to suppress global
> analysis. But given the progress in compilers that I have seen over
> the past 4+ decades, I would expect that the day will come when we won't
> want to be using those command-line flags.
>
> But if you want to ignore KCSAN's warnings, you are free to do so.
>
> > Is this a list with strange locking rules?
> > The only deletes are from within the loop.
> > Adds and deletes are locked.
> > The list traversal isn't locked.
> >
> > I suspect kcsan bleats because it doesn't assume the compiler
> > will use a single instruction/memory operation to read p->next.
> > That is just stupid.
>
> Heh! If I am still around, I will ask you for your evaluation of the
> above statement in 40 years. Actually, 10 years will likely suffice. ;-)
>
> Thanx, Paul