Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
From: Nick Desaulniers
Date: Fri Jun 28 2019 - 14:44:30 EST
On Fri, Jun 28, 2019 at 6:31 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Jun 27, 2019 at 09:12:50AM +0200, Peter Zijlstra wrote:
>
> > Josh came up with the following:
> >
> > + /* If the jump target is close, do a 2-byte nop: */
> > + ".skip -(%l[l_yes] - 1b <= 126), 0x66\n"
> > + ".skip -(%l[l_yes] - 1b <= 126), 0x90\n"
> > + /* Otherwise do a 5-byte nop: */
> > + ".skip -(%l[l_yes] - 1b > 126), 0x0f\n"
> > + ".skip -(%l[l_yes] - 1b > 126), 0x1f\n"
> > + ".skip -(%l[l_yes] - 1b > 126), 0x44\n"
> > + ".skip -(%l[l_yes] - 1b > 126), 0x00\n"
> > + ".skip -(%l[l_yes] - 1b > 126), 0x00\n"
> >
> > Which is a wonderfully gruesome hack :-) So I'll be playing with that
> > for a bit.
>
> For those with interest; full patches at:
>
> https://lkml.kernel.org/r/20190628102113.360432762@xxxxxxxxxxxxx
Do you have a branch pushed that I can pull this from to quickly test w/ Clang?
The .skip trick is wild; I don't quite understand the negation in the
above, or in patch 8/8 for is_byte/is_long.
Also, the comment on 8/8 about patching early hits home; we had a
sign-extending-booleans bug that was causing the address calculation
to be off by two. Jann and Bill had to help me debug that one, and
funnily enough Kees fixed it in LLVM. Fetching exception frames out
of early_idt_handler_common has been my most memorable kernel
debugging experience to date, and I hope I never have to do that
again. Kees this week adjusted where arm64 does static_key enablement
(moved it earlier for Alexander Potapenko's slab
initialization/poisoning set).
As for the wrong __jump_table entry: I consider that a critical issue we
need to fix before the clang-9 release. I'm unloading my current
responsibilities at work to be able to sit and focus on the bug. I'll
probably start a new thread with you, tglx, Josh, and our mailing list
next week (sorry for co-opting this thread). I have been using
creduce quite successfully for finding and fixing our previous codegen
bugs (https://nickdesaulniers.github.io/blog/2019/01/18/finding-compiler-bugs-with-c-reduce/),
but I need to sit and understand the precise failure more in order to
reduce the input. We can see pretty well where in the compilation
pipeline things go wrong; I just find it hard to page through large
inputs such as whole translation units.
--
Thanks,
~Nick Desaulniers