Re: [RFC] LKMM: Add volatile_if()

From: Mathieu Desnoyers
Date: Fri Sep 24 2021 - 16:22:43 EST

----- On Sep 24, 2021, at 3:52 PM, Alan Stern stern@xxxxxxxxxxxxxxxxxxx wrote:

> On Fri, Sep 24, 2021 at 02:38:58PM -0400, Mathieu Desnoyers wrote:
>> Hi,
>> Following the LPC2021 BoF about control dependency, I re-read the kernel
>> documentation about control dependency, and ended up thinking that what
>> we have now is utterly fragile.
>> Considering that the goal here is to prevent the compiler from being able to
>> optimize a conditional branch into something which lacks the control
>> dependency, while letting the compiler choose the best conditional
>> branch in each case, how about the following approach ?
>> #define ctrl_dep_eval(x) ({ BUILD_BUG_ON(__builtin_constant_p((_Bool)
>> x)); x; })
>> #define ctrl_dep_emit_loop(x) ({ __label__ l_dummy; l_dummy: asm volatile goto
>> ("" : : : "cc", "memory" : l_dummy); (x); })
>> #define ctrl_dep_if(x) if ((ctrl_dep_eval(x) && ctrl_dep_emit_loop(1))
>> || ctrl_dep_emit_loop(0))
>> The idea is to forbid the compiler from considering the two branches as
>> identical by adding a dummy loop in each branch with an empty asm goto.
>> Considering that the compiler should not assume anything about the
>> contents of the asm goto (it's been designed so the generated assembly
>> can be modified at runtime), then the compiler can hardly know whether
>> each branch will trigger an infinite loop or not, which should prevent
>> unwanted optimisations.
>> With this approach, the following code now keeps the control dependency:
>> z = READ_ONCE(var1);
>> ctrl_dep_if (z)
>> WRITE_ONCE(var2, 5);
>> else
>> WRITE_ONCE(var2, 5);
>> And the ctrl_dep_eval() checking the constant triggers a build error
>> for:
>> y = READ_ONCE(var1);
>> ctrl_dep_if (y % 1)
>> WRITE_ONCE(var2, 5);
>> else
>> WRITE_ONCE(var2, 6);
>> Which is good to have to ensure the compiler don't end up removing the
>> conditional branch because the resulting evaluation ends up evaluating a
>> constant.
>> Thoughts ?
> As I remember the earlier discussion, Linus felt that the kernel doesn't
> really need any sort of explicit control dependency (although we called
> it "volatile if"). In many cases there is an actual semantic
> dependency, so it doesn't matter what the compiler does -- the hardware
> will enforce the actual dependency. In other cases, we can work around
> the issue by using acquire loads or release stores.

IMHO, having to chase down what the instruction selection does on every
architecture for every supported compiler for each control dependency in
order to confirm that the control dependency is still present in the resulting
assembly is fragile. If the kernel really doesn't need explicit control
dependency, then maybe the whole notion of "control dependency"-based
ordering should be removed in favor of explicit acquire loads/release stores
and barriers.

> In fact, Linus's biggest wish was to have a weak form of compiler
> barrier, one which would block the compiler from reordering accesses
> across the barrier but wouldn't invalidate the compiler's knowledge
> about the values of earlier reads (which barrier() would do).

Then maybe we could simply tweak the ctrl_dep_emit_loop() implementation and
remove the "memory" clobber. Then its only effect is to prevent the compiler
from knowing whether there is an infinite loop, thus preventing reordering
accesses across the conditional branch without requiring the compiler to
discard earlier reads:

#define ctrl_dep_emit_loop(x) ({ __label__ l_dummy; l_dummy: asm_volatile_goto ("" : : : : l_dummy); (x); })

and for the records, the ctrl_dep_eval(x) implementation needs extra parentheses:

#define ctrl_dep_eval(x) ({ BUILD_BUG_ON(__builtin_constant_p((_Bool) (x))); (x); })



Mathieu Desnoyers
EfficiOS Inc.