Re: [RFC] LKMM: Add volatile_if()

From: Alan Stern
Date: Fri Sep 24 2021 - 15:52:33 EST

On Fri, Sep 24, 2021 at 02:38:58PM -0400, Mathieu Desnoyers wrote:
> Hi,
> Following the LPC2021 BoF about control dependency, I re-read the kernel
> documentation about control dependency, and ended up thinking that what
> we have now is utterly fragile.
> Considering that the goal here is to prevent the compiler from being able to
> optimize a conditional branch into something which lacks the control
> dependency, while letting the compiler choose the best conditional
> branch in each case, how about the following approach?
> #define ctrl_dep_eval(x) ({ BUILD_BUG_ON(__builtin_constant_p((_Bool) x)); x; })
> #define ctrl_dep_emit_loop(x) ({ __label__ l_dummy; l_dummy: asm volatile goto ("" : : : "cc", "memory" : l_dummy); (x); })
> #define ctrl_dep_if(x) if ((ctrl_dep_eval(x) && ctrl_dep_emit_loop(1)) || ctrl_dep_emit_loop(0))
> The idea is to forbid the compiler from considering the two branches as
> identical by adding a dummy loop in each branch with an empty asm goto.
> Considering that the compiler should not assume anything about the
> contents of the asm goto (it's been designed so the generated assembly
> can be modified at runtime), then the compiler can hardly know whether
> each branch will trigger an infinite loop or not, which should prevent
> unwanted optimisations.
> With this approach, the following code now keeps the control dependency:
> 	z = READ_ONCE(var1);
> 	ctrl_dep_if (z)
> 		WRITE_ONCE(var2, 5);
> 	else
> 		WRITE_ONCE(var2, 5);
> And the ctrl_dep_eval() checking the constant triggers a build error
> for:
> 	y = READ_ONCE(var1);
> 	ctrl_dep_if (y % 1)
> 		WRITE_ONCE(var2, 5);
> 	else
> 		WRITE_ONCE(var2, 6);
> Which is good to have, to ensure the compiler doesn't end up removing the
> conditional branch because the condition ends up evaluating to a
> constant.
> Thoughts?

As I remember the earlier discussion, Linus felt that the kernel doesn't
really need any sort of explicit control dependency (although we called
it "volatile if"). In many cases there is an actual semantic
dependency, so it doesn't matter what the compiler does -- the hardware
will enforce the actual dependency. In other cases, we can work around
the issue by using acquire loads or release stores.
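
To make the workaround concrete, here is a minimal userspace sketch of
both options. It uses C11 atomics as a stand-in; in the kernel the
corresponding primitives would be smp_load_acquire() and
smp_store_release(). The variables var1 and var2 mirror the quoted
example, and the function names reader_acquire() and reader_release()
are made up for illustration.

```c
#include <stdatomic.h>

/* Hypothetical shared variables, mirroring var1/var2 in the quoted example. */
static _Atomic int var1;
static _Atomic int var2;

/*
 * Option 1: acquire load.  The load of var1 is ordered before every
 * later access in program order, so the store to var2 cannot move
 * ahead of it, even if the compiler merges the two identical branches.
 */
static int reader_acquire(void)
{
	int z = atomic_load_explicit(&var1, memory_order_acquire);

	if (z)
		atomic_store_explicit(&var2, 5, memory_order_relaxed);
	else
		atomic_store_explicit(&var2, 5, memory_order_relaxed);
	return z;
}

/*
 * Option 2: release store.  Every earlier access in program order,
 * including the relaxed load of var1, is ordered before the store.
 */
static int reader_release(void)
{
	int z = atomic_load_explicit(&var1, memory_order_relaxed);

	atomic_store_explicit(&var2, 5, memory_order_release);
	return z;
}
```

Either form gives the load-to-store ordering without relying on the
compiler preserving a conditional branch, at the cost of a possibly
heavier instruction on weakly ordered architectures.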

In fact, Linus's biggest wish was to have a weak form of compiler
barrier, one which would block the compiler from reordering accesses
across the barrier but wouldn't invalidate the compiler's knowledge
about the values of earlier reads (which barrier() would do).
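
For reference, a small sketch of what barrier() costs today, assuming
GCC or Clang. The macro has the same shape as the kernel's barrier();
the variable shared and the function sum_across_barrier() are
hypothetical.

```c
/* Same shape as the kernel's barrier(): an empty asm with a "memory" clobber. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical shared variable, for illustration only. */
static int shared;

static int sum_across_barrier(void)
{
	int a = shared;	/* the compiler now knows the value of shared */
	int b;

	/*
	 * Keeps memory accesses from being moved across this point, but
	 * also makes the compiler forget what it knew about shared.
	 */
	barrier();

	b = shared;	/* must be a fresh load; the value in a cannot be reused */

	return a + b;
}
```

The weaker barrier described above would keep the ordering effect but
still let the compiler reuse a for the second read.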

Alan Stern