Re: [RFC] LKMM: Add volatile_if()

From: Mathieu Desnoyers
Date: Fri Sep 24 2021 - 18:07:46 EST


----- On Sep 24, 2021, at 4:39 PM, Mathieu Desnoyers mathieu.desnoyers@xxxxxxxxxxxx wrote:

> ----- On Sep 24, 2021, at 3:55 PM, Segher Boessenkool segher@xxxxxxxxxxxxxxxxxxx
> wrote:
>
>> Hi!
>>
>> On Fri, Sep 24, 2021 at 02:38:58PM -0400, Mathieu Desnoyers wrote:
>>> Following the LPC2021 BoF about control dependency, I re-read the kernel
>>> documentation about control dependency, and ended up thinking that what
>>> we have now is utterly fragile.
>>>
>>> Considering that the goal here is to prevent the compiler from being able to
>>> optimize a conditional branch into something which lacks the control
>>> dependency, while letting the compiler choose the best conditional
>>> branch in each case, how about the following approach ?
>>>
>>> #define ctrl_dep_eval(x) ({ BUILD_BUG_ON(__builtin_constant_p((_Bool) x)); x; })
>>> #define ctrl_dep_emit_loop(x) ({ __label__ l_dummy; l_dummy: asm volatile goto ("" : : : "cc", "memory" : l_dummy); (x); })
>>> #define ctrl_dep_if(x) if ((ctrl_dep_eval(x) && ctrl_dep_emit_loop(1)) || ctrl_dep_emit_loop(0))
>>
>> [The "cc" clobber only pessimises things: the asm doesn't actually
>> clobber the default condition code register (which is what "cc" means),
>> and you can have conditional branches using other condition code
>> registers, or on other registers even (general purpose registers are
>> common).]
>
> I'm currently considering removing both "memory" and "cc" clobbers from
> the asm goto.
>
>>
>>> The idea is to forbid the compiler from considering the two branches as
>>> identical by adding a dummy loop in each branch with an empty asm goto.
>>> Considering that the compiler should not assume anything about the
>>> contents of the asm goto (it's been designed so the generated assembly
>>> can be modified at runtime), then the compiler can hardly know whether
>>> each branch will trigger an infinite loop or not, which should prevent
>>> unwanted optimisations.
>>
>> The compiler looks if the code is identical, nothing more, nothing less.
>> There are no extra guarantees. In principle the compiler could see both
>> copies are empty asms looping to self, and so consider them equal.
>
> I would expect the compiler not to attempt combining asm goto based on their
> similarity because it has been made clear starting from the original
> requirements from the kernel community to the gcc developers that one major
> use-case of asm goto involves self-modifying code (patching between nops and
> jumps).
>
> If this happens to be a real possibility, then we may need to work around this
> for other uses of asm goto as well.

Now that I page this stuff back into my brain (I last looked at it in detail some
12 years ago), I recall that letting compilers combine asm goto statements which
happen to match (e.g. through CSE) was actually something we wanted to permit,
because we don't care about patching the nops into jumps for each individual
asm goto if they happen to have the same effect when modified.

>
> If there is indeed a scenario where the compiler can combine similar asm goto
> statements, then I suspect we may want to emit unique dummy code in the
> assembly which gets placed in a discarded section, e.g.:
>
> #define ctrl_dep_emit_loop(x) ({ __label__ l_dummy; l_dummy: asm goto ( \
> ".pushsection .discard.ctrl_dep\n\t" \
> ".long " __stringify(__COUNTER__) "\n\t" \
> ".popsection\n\t" \
> "" : : : : l_dummy); (x); })
>

So I think your point is very much valid: we need some way to make the content
of the asm goto different between the two branches. I think the __COUNTER__
approach is overkill though: we don't care about making each of the asm goto
loops unique within the entire file; we just don't want them to match between
the two legs of the branch.

So something like this should be enough:

#define ctrl_dep_emit_loop(x) ({ __label__ l_dummy; l_dummy: asm goto ( \
".pushsection .discard.ctrl_dep\n\t" \
".long " __stringify(x) "\n\t" \
".popsection\n\t" \
"" : : : : l_dummy); (x); })

So we emit a 0 and a 1, respectively, into the discarded section.
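
For illustration, here is a minimal userspace sketch of how the macros would
fit together. It assumes GCC or Clang targeting ELF (for asm goto and
.pushsection). __stringify(), READ_ONCE() and WRITE_ONCE() are simplified
stand-ins for the kernel helpers, ctrl_dep_eval() is left out because
BUILD_BUG_ON() is kernel-only, asm is spelled __asm__ so that it also builds
in strict ISO mode, and ctrl_dep_store() with its value 42 is a hypothetical
caller. It only shows that the construct builds and takes the expected leg; it
does not demonstrate that the control dependency is preserved.

```c
#include <assert.h>

/* Userspace stand-ins for kernel helpers (assumptions for this sketch). */
#define __stringify_1(x)	#x
#define __stringify(x)		__stringify_1(x)
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/*
 * Empty asm goto which, as far as the compiler can tell, may jump back to
 * l_dummy. The ".long x" placed in .discard.ctrl_dep makes the asm text
 * differ between the two legs (x is 1 in one leg and 0 in the other).
 */
#define ctrl_dep_emit_loop(x) ({ __label__ l_dummy; l_dummy: __asm__ goto ( \
	".pushsection .discard.ctrl_dep\n\t" \
	".long " __stringify(x) "\n\t" \
	".popsection\n\t" \
	"" : : : : l_dummy); (x); })

/* Same shape as the proposed ctrl_dep_if(), minus ctrl_dep_eval(). */
#define ctrl_dep_if(x) \
	if (((x) && ctrl_dep_emit_loop(1)) || ctrl_dep_emit_loop(0))

/*
 * Hypothetical caller: both legs perform the same store, which is the case
 * where a plain "if" gives the compiler room to merge the legs and hoist
 * the store above the load. Returns 1 if the true leg ran, 0 otherwise.
 */
int ctrl_dep_store(int *flag, int *data)
{
	int taken;

	assert(flag && data);

	ctrl_dep_if (READ_ONCE(*flag)) {
		WRITE_ONCE(*data, 42);
		taken = 1;
	} else {
		WRITE_ONCE(*data, 42);
		taken = 0;
	}
	return taken;
}
```

Outside the kernel linker script nothing discards .discard.ctrl_dep, so in a
userspace build the two .long values simply end up in an unloaded section of
the object file.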

Thoughts ?

Thanks,

Mathieu


> But then a similar trick would be needed for jump labels as well.
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com