Re: [RFC] LKMM: Add volatile_if()

From: Linus Torvalds
Date: Fri Jun 04 2021 - 15:10:49 EST


On Fri, Jun 4, 2021 at 11:27 AM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> volatile_if (READ_ONCE(*x) * 0 + READ_ONCE(*y))
> WRITE_ONCE(*z, 42);
>
> where there is no ordering between *x and *z.

I wouldn't worry about it.

I think a compiler is allowed to optimize away stupid code.

I get upset when a compiler says "oh, that's undefined, so I will
ignore the obvious meaning of it", but that's a different thing
entirely.

I really wish that the C standards group showed some spine, and said
"there is no undefined, there is only implementation-defined". That
would solve a *lot* of problems.

But I also realize that will never happen. Because "spine" and "good
taste" is not something that I've ever heard of happening in an
industry standards committee.

Side note: it is worth noting that my version of "volatile_if()" has
an added little quirk: it _ONLY_ orders the stuff inside the
if-statement.

I do think it's worth not adding new special cases (especially that
"asm goto" hack that will generate worse code than the compiler could
do), but it means that

x = READ_ONCE(ptr);
volatile_if (x > 0)
WRITE_ONCE(*z, 42);

has an ordering, but if you write it as

x = READ_ONCE(ptr);
volatile_if (x <= 0)
return;
WRITE_ONCE(*z, 42);

then I could in theory see teh compiler doing that WRITE_ONCE() as
some kind of non-control dependency.

That said, I don't actually see how the compiler could do anything
that actually broke the _semantics_ of the code. Yes, it could do the
write using a magical data dependency on the conditional and turning
it into a store on a conditional address instead (before doing the
branch), but honestly, I don't see how that would actually break
anything.

So this is more of a "in theory, the two sides are not symmetric". The
"asm volatile" in a barrier() will force the compiler to generate the
branch, and the memory clobber in barrier() will most certainly force
any stores inside the "volatile_if()" to be after the branch.

But because the memory clobber is only inside the if-statement true
case, the false case could have the compiler migrate any code in that
false thing to before the if.

Again, semantics do matter, and I don't see how the compiler could
actually break the fundamental issue of "load->conditional->store is a
fundamental ordering even without memory barriers because of basic
causality", because you can't just arbitrarily generate speculative
stores that would be visible to others.

But at the same time, that's *such* a fundamental rule that I really
am intrigued why people think "volatile_if()" is needed in reality (as
opposed to some "in theory, the compiler can know things that are
unknowable thanks to a magical oracle" BS argument)

Linus