Re: [PATCH RFC tip/core/rcu 12/15] lib/assoc_array: Remove smp_read_barrier_depends()

From: Dmitry Vyukov
Date: Wed Oct 11 2017 - 16:00:12 EST


On Wed, Oct 11, 2017 at 8:56 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Oct 11, 2017 at 11:43 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>
>> A long thread and I lost track somewhat, but, yes, from KTSAN (data
>> race detector) point of view we will need a way to understand when
>> things are ordered due to data and control dependencies
>> (i.e.effectively acquire but only wrt the dependent stuff).
>> There is a logic level and physical level, it's perfectly fine if on
>> physical level all these _DEP/_CTRL variants are no-op (the same as
>> READ_ONCE, at least on todays archs with todays compilers). But on
>> logical level they represent a well defined, meaningful thing.
>
> No, they do not.
>
> Do you believe in fairies and Santa Claus?
>
> Because quite frankly, the likelihood of either of those being true is
> _way_ higher than the likelihood of any normal human ever getting
> those things right.
>
> So asking a programmer to annotate whether two memory accesses have a
> data dependency or a control dependency is completely inappropriate.
> You won't get people understanding it to begin with, much less then
> figure out subtle things like whether a control dependency is an
> actual branch, or might be turned into a data dependency through
> select, or whatever.
>
> We've had really smart people who wrote core code that couldn't get it
> right, and that weren't sure if a control dependency was really
> guaranteed or not.
>
> That is *exactly* the kinds of thing that _automation_ should handle.
> Not some human. Figure the data/control dependencies out from the
> code, not from some logical level.
>
> I saw the contortions that the ISO C people tried to go through just
> to describe control and data dependencies. It was awful. It should
> have never been described on that kind of level to begin with, when it
> would have been much easier to just describe it to compiler people as
> "this is a consume relationship". The same rules apply here. Don't
> make it about some high-level thing and humans annontating things.
> Make it about the actual generated code.


First of all, I am not asking for a big overhaul and nothing changes
wrt codegen (i.e. the code works the same way it used to work).
If/when it is done, it will be guided by the race detector. It is
similar to enabling a new compiler warning and cleaning up the code
base incrementally. Compiler warnings don't affect code behavior, and
one doesn't need to proactively guess where they will fire (a compiler
warnings points you to the line what compiler does not like).

Second, we would very much like to just analyze the code as is. And in
fact we gone to great lengths teaching it about standalone memory
barriers (rmb/wmb), per-cpu variables, seqlocks, rcu, etc (none of
that we had in user-space). But this is just where we need help
(basically the same problem you mentioned with ISO C, it's effectively
not defineable/analyzable by compiler).

As far as I remember we added just few of READ_ONCE_CTRL (<5) to get
the prototype running. So it's not that we are talking about
hundreds/thousands of them. And again the tool will point to the
places it does not like. So I am just asking for a practical tradeoff:
few annotations in exchange for hundreds/thousands of bugs killed.

I did not want to revive this discussion right now, we already had it
and decided that we need a clear story for READ_ONCE_CTRL first. Our
plan is to revive the tool, find more real bugs and show what are the
actual code changes required for the tool. Hope we can get to some
compromise.