Re: LKMM litmus test for Roman Penyaev's rcu-rr

From: Linus Torvalds
Date: Wed May 30 2018 - 18:01:21 EST


On Wed, May 30, 2018 at 2:08 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Indeed. The very first line Linus quoted in his first reply to me
> (elided above) was:
>
> Putting this into herd would be extremely difficult, if not impossible,
> because it involves analyzing code that was not executed.
>
> It should be clear from this that I was talking about herd. Not gcc or
> real hardware.

So what does herd actually work on? The source code or the executable,
or a trace?

I found the herd paper, but I'm on the road helping my daughter in
college move, and I don't have the background to skim the paper
quickly and come up with the obvious answer, so I'll just ask.

Because I really think that from our memory model standpoint, we
really do have the rule that

load -> cond -> store

is ordered - even if the store address and store data are in no way
dependent on the load. The only thing that matters is that there's a
conditional that is dependent on the load in between the load and the
store.
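
A minimal sketch of that shape, for illustration only. The READ_ONCE()
and WRITE_ONCE() definitions here are simplified, int-only userspace
stand-ins (an assumption, not the kernel's definitions), and the name
load_cond_store() is hypothetical. Run single-threaded, it shows the
code pattern, not the ordering itself:

```c
/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(v)       (*(const volatile int *)&(v))
#define WRITE_ONCE(v, val) (*(volatile int *)&(v) = (val))

static int x, y;

/*
 * load -> cond -> store: neither the address (&y) nor the data (1)
 * of the store depends on r1. Only whether the store executes does.
 */
static int load_cond_store(void)
{
	int r1 = READ_ONCE(x);		/* load */

	if (r1)				/* cond, dependent on the load */
		WRITE_ONCE(y, 1);	/* store */
	return r1;
}
```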

Note that this is *independent* of how you get to the store. It
doesn't matter if it's a fallthrough conditional jump or a taken
conditional jump, or whether there is a joining.

The only thing that *does* matter is whether the conditional can be
turned into a "select" statement. If the conditional can be turned
into a data dependency, then the ordering goes away. That is why it
was relevant whether "C" contained a barrier or not (it doesn't even
have to be a *memory* barrier, it just has to be a barrier for code
generation).
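
One way to picture the "select" case, as a hedged sketch: when the
conditional body only picks a value, the branch form and the select
form below compute the same thing, so a compiler may emit the second
(for example as a conditional move) and leave no branch between the
load and the later store. The macros are simplified int-only stand-ins
and the function names are hypothetical:

```c
/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(v)       (*(const volatile int *)&(v))
#define WRITE_ONCE(v, val) (*(volatile int *)&(v) = (val))

static int x, y;

/* Branch form: the conditional only chooses a value for a local. */
static int pick_branch(void)
{
	int r1 = READ_ONCE(x);
	int r2 = 10;

	if (r1)
		r2 = 20;
	WRITE_ONCE(y, 5);	/* store after the join */
	return r2;
}

/* Select form: same result, with the conditional as a data dependency. */
static int pick_select(void)
{
	int r1 = READ_ONCE(x);
	int r2 = r1 ? 20 : 10;

	WRITE_ONCE(y, 5);
	return r2;
}
```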

Note that the "C doesn't even have to have a memory barrier" is
important, because the ordering from load->cond->store doesn't actually
have anything to do with any memory ordering imposed by C, it's much
more fundamental than that.

> Preserving the order of volatile accesses isn't sufficient. The
> compiler is still allowed to translate
>
>	r1 = READ_ONCE(x);
>	if (r1) {
>		...
>	}
>	WRITE_ONCE(y, r2);
>
> into something resembling
>
>	r1 = READ_ONCE(x);
>	WRITE_ONCE(y, r2);
>	if (r1) {
>		...
>	}

Correct.

What matters is that the code in C (now called "..." above ;^) has a
build-time barrier that means that the compiler cannot do that.

That barrier can be pretty much anything. The important part is
literally that there's a guarantee that the write cannot be migrated
below the conditional.
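
As a sketch of what such a build-time barrier could look like: the
barrier() definition below is the usual GCC/Clang empty-asm form with
a "memory" clobber, and the READ_ONCE()/WRITE_ONCE() macros are
simplified int-only stand-ins (both are assumptions for a userspace
build). The function name store_after_cond() is hypothetical:

```c
/* Compiler-only barrier: emits no instruction (GCC/Clang syntax). */
#define barrier()          __asm__ __volatile__("" : : : "memory")
/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(v)       (*(const volatile int *)&(v))
#define WRITE_ONCE(v, val) (*(volatile int *)&(v) = (val))

static int x, y;

static void store_after_cond(int r2)
{
	int r1 = READ_ONCE(x);

	if (r1) {
		/* Stands in for "C": a code-generation barrier only. */
		barrier();
	}
	/* A volatile store is not moved across the barrier() above. */
	WRITE_ONCE(y, r2);
}
```

The barrier() here generates no code of its own; it only constrains
how the compiler may schedule the accesses around it.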

But I don't know what level 'herd' works on. If it doesn't see
compiler barriers (eg our own "barrier()" macro that literally is just
that), only sees the generated code, then it really has no other
information than what the compiler _happened_ to do - it doesn't know
if the compiler did the store after the conditional because it *had*
to do so, or whether it was just a random instruction scheduling
decision.

Linus