Re: Control Dependencies vs C Compilers

From: Peter Zijlstra
Date: Wed Oct 07 2020 - 05:33:02 EST


On Tue, Oct 06, 2020 at 11:20:01PM +0200, Florian Weimer wrote:
> * Peter Zijlstra:
>
> > Our Documentation/memory-barriers.txt has a Control Dependencies section
> > (which I shall not replicate here for brevity) which lists a number of
> > caveats. But in general the work-around we use is:
> >
> >     x = READ_ONCE(*foo);
> >     if (x > 42)
> >             WRITE_ONCE(*bar, 1);
> >
> > Where READ/WRITE_ONCE() cast the variable volatile. The volatile
> > qualifier dissuades the compiler from assuming it knows things and we
> > then hope it will indeed emit the branch like we'd expect.
> >
> >
> > Now, hoping the compiler generates correct code is clearly not ideal and
> > very dangerous indeed. Which is why my question to the compiler folks
> > assembled here is:
> >
> > Can we get a C language extension for this?
>
> For what exactly?

A branch that cannot be optimized away and that prohibits lifting stores
over it. One possible suggestion would be to allow the volatile keyword
as a qualifier to if.

    x = *foo;
    volatile if (x > 42)
            *bar = 1;

This would tell the compiler that the condition is special in that it
must emit a conditional branch instruction and that it must not lift
stores (or sequence points) over it.
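
Since "volatile if" does not exist in any compiler today, the closest
thing that can actually be built is the current workaround. A minimal
sketch in GNU C follows, assuming simplified stand-ins for READ_ONCE()
and WRITE_ONCE() (the kernel's real macros are more involved) and a
hypothetical helper named ctrl_dep():

```c
/* Simplified stand-ins, NOT the kernel's real definitions: they only
 * cast the access to volatile, as described earlier in the thread. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical helper: the store to *bar should only happen after the
 * load from *foo has been observed to exceed 42. Nothing in the C
 * standard forces the compiler to keep the branch; we only hope that
 * the volatile accesses persuade it to. */
static void ctrl_dep(int *foo, int *bar)
{
	int x = READ_ONCE(*foo);

	if (x > 42)
		WRITE_ONCE(*bar, 1);
}
```

The sketch shows only the single-threaded behaviour; whether the
emitted code preserves the load-to-store ordering has to be checked in
the generated assembly.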

> Do you want a compiler that never simplifies conditional expressions
> (like some people want compilers that never re-associate floating point
> operations)?

No. I'm fine with optimizing things in general, I just want to be able
to control/limit it for a few specific cases.

> > And while we have a fair number (and growing) existing users of this in
> > the kernel, I'd not be averse to having to annotate them.
>
> But not using READ_ONCE and WRITE_ONCE?

I'm OK with READ_ONCE(), but the WRITE_ONCE() doesn't help much, if at
all. The compiler is always allowed to lift stores, regardless of the
qualifiers used.
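
To illustrate the kind of lifting meant here, consider the case where
both arms of a branch perform the same store, which is one of the
caveats listed in Documentation/memory-barriers.txt. The following is
only a sketch; the function names are made up for illustration, and the
second function is written by hand to show what a compiler could
legally produce from the first:

```c
/* As written: the same volatile store appears in both arms. */
static int as_written(volatile int *a, volatile int *b)
{
	int q = *a;
	int path;

	if (q) {
		*b = 1;
		path = 1;
	} else {
		*b = 1;
		path = 2;
	}
	return path;
}

/* Hand-written equivalent a compiler may produce: the store is hoisted
 * above the branch, so it no longer depends on the outcome of the
 * load, and the control dependency is gone. */
static int as_transformed(volatile int *a, volatile int *b)
{
	int q = *a;

	*b = 1;
	return q ? 1 : 2;
}
```

Both functions behave identically for a single thread, which is exactly
why the transformation is permitted even though the accesses are
volatile.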

> I think in GCC, they are called __atomic_load_n(foo, __ATOMIC_RELAXED)
> and __atomic_store_n(foo, __ATOMIC_RELAXED). GCC can't optimize relaxed
> MO loads and stores because the C memory model is defective and does not
> actually guarantee the absence of out-of-thin-air values (a property it
> was supposed to have).

AFAIK people want to get that flaw in the C memory model fixed (which to
me seems like a very good idea).

Also, AFAIK the compiler would be allowed to lift __atomic_store_n(foo,
__ATOMIC_RELAXED) out of a branch.
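
For reference, the same pattern written with the GCC builtins might
look like the sketch below. Note that the store builtin takes the value
as its second argument, which the shorthand above leaves out. The
helper name ctrl_dep_atomic() is hypothetical:

```c
/* Hypothetical helper using the GCC/Clang __atomic builtins with
 * relaxed memory order. Relaxed ordering by itself does not promise
 * that the store stays behind the branch. */
static void ctrl_dep_atomic(int *foo, int *bar)
{
	int x = __atomic_load_n(foo, __ATOMIC_RELAXED);

	if (x > 42)
		__atomic_store_n(bar, 1, __ATOMIC_RELAXED);
}
```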

> A different way of annotating this would be a variant of _Atomic where
> plain accesses have relaxed MO, not seq-cst MO.

So Linux isn't going to use _Atomic; we disagree with the C memory model
too much. Also, volatile is perfectly sufficient for our purposes.

I know there's a bunch of people in the C committee that want to get rid
of volatile, but that's just not going to happen in the real world,
there's too much volatile out there.

More to the point, Linux already relies on this without the later stores
being annotated, and it works because lifting those stores just really
doesn't make sense (and it's further constrained by sequence points,
although I'm not sure what, if anything, the effect of LTO is on
sequence points -- inlining, for example, removes sequence points, which
is sometimes scary as heck).