Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs

From: Peter Zijlstra
Date: Thu Jun 27 2019 - 03:13:20 EST


On Wed, Jun 26, 2019 at 03:14:05PM -0700, Nick Desaulniers wrote:
> On Wed, Jun 26, 2019 at 1:49 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Jun 25, 2019 at 11:15:57AM -0700, Nick Desaulniers wrote:
> >
> > > Unreleased versions of Clang built from source can;
> >
> > I've bad experiences with using unreleased compilers; life is too short.
>
> Yes; but before release is when they need the help the most in order
> for testing to find regressions.
>
> >
> > > We're currently planning multiple output constraint support w/ asm
> > > goto, and have recently implemented things like
> > > __GCC_ASM_FLAG_OUTPUTS__.
> >
> > That's good to hear.
> >
> > > If there's other features that we should
> > > start implementing, please let us know.
> >
> > If you've got any ideas on how to make this:
> >
> > https://lkml.kernel.org/r/20190621120923.GT3463@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > work, that'd be nice. Basically I wanted the asm goto to emit a 2 or 5
> > byte JMP/NOP depending on the displacement size. We can trivially get
> > JMP right by using:
> >
> > jmp \l_yes
> >
> > and letting the assembler sort it, but getting the NOP right has so far
> > eluded me:
> >
> > .if \l_yes - (. + 2) < 127
> > .byte 0x66, 0x90
> > .else
> > .byte STATIC_KEY_INIT_NOP
> > .endif
> >
> > doesn't work. We can of course unconditionally emit the JMP and then
> > rewrite the binary afterward, and replace the emitted jumps with the
> > right size NOP, but that's a bit yuck.
> >
> > Once it emits the variable size instruction consistently, we can update
> > the patching side to use the same condition to select the new
> > instruction (and fix objtool).
>
> Not sure; the assembler directives and their requirements aren't
> something I'm too familiar with.

Josh came up with the following:

+ /* If the jump target is close, do a 2-byte nop: */
+ ".skip -(%l[l_yes] - 1b <= 126), 0x66\n"
+ ".skip -(%l[l_yes] - 1b <= 126), 0x90\n"
+ /* Otherwise do a 5-byte nop: */
+ ".skip -(%l[l_yes] - 1b > 126), 0x0f\n"
+ ".skip -(%l[l_yes] - 1b > 126), 0x1f\n"
+ ".skip -(%l[l_yes] - 1b > 126), 0x44\n"
+ ".skip -(%l[l_yes] - 1b > 126), 0x00\n"
+ ".skip -(%l[l_yes] - 1b > 126), 0x00\n"

Which is a wonderfully gruesome hack :-) So I'll be playing with that
for a bit.
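
For anyone puzzling over why the above works: GNU as evaluates a true
comparison to -1 and a false one to 0, so -(cond) is a byte count of 1 or
0, and each .skip emits its fill byte either once or not at all. Below is
a rough sketch of that arithmetic in plain C. It is only a model, not
kernel code; gas_cmp(), skip() and emit_nop() are made-up names for
illustration, and disp stands in for "%l[l_yes] - 1b".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GNU as comparison result: -1 when true, 0 when false. */
static int gas_cmp(int cond)
{
	return cond ? -1 : 0;
}

/* Model of ".skip count, fill": append 'count' copies of 'fill'. */
static size_t skip(uint8_t *buf, size_t pos, int count, uint8_t fill)
{
	while (count-- > 0)
		buf[pos++] = fill;
	return pos;
}

/*
 * Hypothetical helper mirroring the .skip lines above.
 * 'disp' models "%l[l_yes] - 1b"; 'buf' must have room for 5 bytes.
 * Returns the number of bytes emitted (2 or 5).
 */
static size_t emit_nop(long disp, uint8_t *buf)
{
	int is_near = gas_cmp(disp <= 126);
	int is_far  = gas_cmp(disp > 126);
	size_t n = 0;

	assert(buf != NULL);

	/* If the jump target is close, do a 2-byte nop: */
	n = skip(buf, n, -is_near, 0x66);
	n = skip(buf, n, -is_near, 0x90);
	/* Otherwise do a 5-byte nop: */
	n = skip(buf, n, -is_far, 0x0f);
	n = skip(buf, n, -is_far, 0x1f);
	n = skip(buf, n, -is_far, 0x44);
	n = skip(buf, n, -is_far, 0x00);
	n = skip(buf, n, -is_far, 0x00);

	return n;
}
```

Exactly one of the two conditions holds for any displacement, so the
model always produces either the 2-byte or the 5-byte sequence and never
a mix of both.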