Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs

From: Nick Desaulniers
Date: Wed Jun 26 2019 - 18:33:50 EST


On Wed, Jun 26, 2019 at 9:31 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
>
> > > > but it also makes objtool unhappy:
>
> > > > arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
>
> > I just checked two of them in the disassembly. In both cases it's jump
> > label related. Here is one:
> >
> > asm volatile("1: rdmsr\n"
> > 410: b9 59 02 00 00 mov $0x259,%ecx
> > 415: 0f 32 rdmsr
> > 417: 49 89 c6 mov %rax,%r14
> > 41a: 48 89 d3 mov %rdx,%rbx
> > return EAX_EDX_VAL(val, low, high);
> > 41d: 48 c1 e3 20 shl $0x20,%rbx
> > 421: 48 09 c3 or %rax,%rbx
> > 424: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > 429: eb 0f jmp 43a <get_fixed_ranges+0xaa>
> > do_trace_read_msr(msr, val, 0);
> > 42b: bf 59 02 00 00 mov $0x259,%edi <------- "unreachable"

I assume that if 0x42b really is unreachable, that's bad, as $0x259 is
never stored in %edi before the call to do_trace_read_msr() at 0x435...

> > 430: 48 89 de mov %rbx,%rsi
> > 433: 31 d2 xor %edx,%edx
> > 435: e8 00 00 00 00 callq 43a <get_fixed_ranges+0xaa>
> > 43a: 44 89 35 00 00 00 00 mov %r14d,0x0(%rip) # 441 <get_fixed_ranges+0xb1>
>
> Thomas provided the actual .o file, and from that we find that the
> .rela__jump_table entries look like:
>
> 000000000010 000100000002 R_X86_64_PC32 0000000000000000 .text + 3e9
> 000000000014 000100000002 R_X86_64_PC32 0000000000000000 .text + 3f0
> 000000000018 006100000018 R_X86_64_PC64 0000000000000000 __tracepoint_read_msr + 8

I assume these relocations come from arch_static_branch() (and thus
appear in triples?)

static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
{
	asm_volatile_goto("1:"
		".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
		".pushsection __jump_table, \"aw\" \n\t"
		_ASM_ALIGN "\n\t"
		".long 1b - ., %l[l_yes] - . \n\t" // 1, 2
		_ASM_PTR "%c0 + %c1 - .\n\t" // 3
		".popsection \n\t"
		: : "i" (key), "i" (branch) : : l_yes);
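
To make the triple concrete, here is a minimal userspace sketch of how I
read the layout: two 32-bit self-relative fields followed by a
pointer-sized key field, 16 bytes per entry, which would match the 0x10
stride in the relocation offsets. The struct name, the helper names and
the section load addresses are all made up for illustration; this is not
the kernel's code, only a model of the "dest - ." arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of one 16-byte __jump_table entry as emitted by the
 * asm above. The struct name and helpers are hypothetical.
 */
struct jump_entry_model {
	int32_t code;   /* ".long 1b - ."            (reloc 1, PC32) */
	int32_t target; /* ".long %l[l_yes] - ."     (reloc 2, PC32) */
	int64_t key;    /* _ASM_PTR "%c0 + %c1 - ."  (reloc 3, PC64) */
};

_Static_assert(sizeof(struct jump_entry_model) == 16,
	       "entries are 0x10 apart, as in the relocation offsets");

/* Value stored in a self-relative field: destination minus field address. */
static int64_t encode_rel(uint64_t field_addr, uint64_t dest)
{
	return (int64_t)(dest - field_addr);
}

/* Inverse: recover the destination from the field address and its contents. */
static uint64_t decode_rel(uint64_t field_addr, int64_t stored)
{
	return field_addr + (uint64_t)stored;
}

/*
 * Build the second entry from the dump (table offset 0x20) for made-up
 * section load addresses, using the addends from the relocation listing.
 */
static struct jump_entry_model make_second_entry(uint64_t text, uint64_t table)
{
	struct jump_entry_model e;
	uint64_t entry = table + 0x20;

	e.code   = (int32_t)encode_rel(entry + 0, text + 0x424); /* the NOP */
	e.target = (int32_t)encode_rel(entry + 4, text + 0x3f0); /* as recorded */
	e.key    = 0; /* key field left out of this sketch */

	/* Round trip: decoding the code field must give back the NOP address. */
	assert(decode_rel(entry + 0, e.code) == text + 0x424);
	return e;
}
```

With .text assumed at 0x1000 and __jump_table at 0x8000, decoding the
target field of that entry gives 0x13f0, i.e. .text + 3f0, which is what
the relocation says and not the +42b block that follows the NOP.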

> 000000000020 000100000002 R_X86_64_PC32 0000000000000000 .text + 424
> 000000000024 000100000002 R_X86_64_PC32 0000000000000000 .text + 3f0
> 000000000028 006100000018 R_X86_64_PC64 0000000000000000 __tracepoint_read_msr + 8
>
> From this we find that the jump target that goes with the NOP at +424 is
> +3f0, not +42b as one would expect.
>
> And as Josh noted, it is also 'weird' that this +3f0 is the very same as
> the target for the previous entry.

(Ok, I think I did talk to Josh about this, and I think he did mention
something about the jump targets, but I didn't really understand the
issue well at the time).
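
For what it's worth, the "same target for two sites" pattern is easy to
check for mechanically once the entries are decoded. A rough sketch,
assuming the entries have already been turned into (code, target)
offsets into .text; the type and function names are hypothetical, and a
shared target is only a hint that something is off, not proof of a bug
by itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded jump table entry, as offsets into .text (hypothetical type). */
struct branch_site {
	uint64_t code;   /* offset of the NOP/JMP */
	uint64_t target; /* offset the entry claims it jumps to */
};

/* Count pairs of distinct code sites that record the same target. */
static size_t count_shared_targets(const struct branch_site *s, size_t n)
{
	size_t pairs = 0;

	assert(s != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (s[i].code != s[j].code && s[i].target == s[j].target)
				pairs++;
	return pairs;
}
```

Fed the two entries from the dump, (3e9, 3f0) and (424, 3f0), this
reports one shared pair; with the second target at 42b it reports none.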

>
> When we compare the code at both sites, we find:
>
> 3f0: bf 58 02 00 00 mov $0x258,%edi
> 3f5: 48 89 de mov %rbx,%rsi
> 3f8: 31 d2 xor %edx,%edx
> 3fa: e8 00 00 00 00 callq 3ff <get_fixed_ranges+0x6f>
> 3fb: R_X86_64_PC32 do_trace_read_msr-0x4
>
> vs
>
> 42b: bf 59 02 00 00 mov $0x259,%edi
> 430: 48 89 de mov %rbx,%rsi
> 433: 31 d2 xor %edx,%edx
> 435: e8 00 00 00 00 callq 43a <get_fixed_ranges+0xaa>
> 436: R_X86_64_PC32 do_trace_read_msr-0x4
>
> Which is not in fact the same code.
>
> So for some reason the .rela__jump_table are buggy on this clang build.

So that sounds like a correctness bug then. (I'd been testing with
STATIC_KEYS_SELFTEST, which I guess doesn't expose this.) I'm kind of
surprised we can boot and pass STATIC_KEYS_SELFTEST. Is there any way
you could help us pare this down to a test case?
--
Thanks,
~Nick Desaulniers