Re: [PATCH] perf/x86/intel: Mark expected switch fall-throughs
From: Peter Zijlstra
Date: Wed Jun 26 2019 - 12:31:27 EST
On Tue, Jun 25, 2019 at 11:47:06PM +0200, Thomas Gleixner wrote:
> > On Tue, Jun 25, 2019 at 09:53:09PM +0200, Thomas Gleixner wrote:
> > > but it also makes objtool unhappy:
> > > arch/x86/kernel/cpu/mtrr/generic.o: warning: objtool: get_fixed_ranges()+0x9b: unreachable instruction
> I just checked two of them in the disassembly. In both cases it's jump
> label related. Here is one:
>
> asm volatile("1: rdmsr\n"
> 410: b9 59 02 00 00 mov $0x259,%ecx
> 415: 0f 32 rdmsr
> 417: 49 89 c6 mov %rax,%r14
> 41a: 48 89 d3 mov %rdx,%rbx
> return EAX_EDX_VAL(val, low, high);
> 41d: 48 c1 e3 20 shl $0x20,%rbx
> 421: 48 09 c3 or %rax,%rbx
> 424: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 429: eb 0f jmp 43a <get_fixed_ranges+0xaa>
> do_trace_read_msr(msr, val, 0);
> 42b: bf 59 02 00 00 mov $0x259,%edi <------- "unreachable"
> 430: 48 89 de mov %rbx,%rsi
> 433: 31 d2 xor %edx,%edx
> 435: e8 00 00 00 00 callq 43a <get_fixed_ranges+0xaa>
> 43a: 44 89 35 00 00 00 00 mov %r14d,0x0(%rip) # 441 <get_fixed_ranges+0xb1>
Thomas provided the actual .o file, and from that we find that the
.rela__jump_table entries look like:
000000000010 000100000002 R_X86_64_PC32 0000000000000000 .text + 3e9
000000000014 000100000002 R_X86_64_PC32 0000000000000000 .text + 3f0
000000000018 006100000018 R_X86_64_PC64 0000000000000000 __tracepoint_read_msr + 8
000000000020 000100000002 R_X86_64_PC32 0000000000000000 .text + 424
000000000024 000100000002 R_X86_64_PC32 0000000000000000 .text + 3f0
000000000028 006100000018 R_X86_64_PC64 0000000000000000 __tracepoint_read_msr + 8
From this we find that the jump target that goes with the NOP at +424 is
+3f0, not +42b as one would expect.
And as Josh noted, it is also 'weird' that this +3f0 is the very same as
the target for the previous entry.
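For anyone following along, here is a rough C sketch of how the dump above
maps onto table entries. It assumes the relative jump-entry layout (32-bit
code offset, 32-bit target offset, 64-bit key), which is what the
0x10/0x14/0x18 and 0x20/0x24/0x28 relocation offsets suggest. The struct and
function names are made up for illustration and are not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one __jump_table entry; field names are illustrative. */
struct jump_entry_sketch {
	int32_t code;   /* PC-relative offset of the NOP/JMP site */
	int32_t target; /* PC-relative offset of the branch target */
	int64_t key;    /* PC-relative offset of the static key */
};

/* The dump shows relocations at +0x0, +0x4 and +0x8 of each 16-byte slot. */
_Static_assert(sizeof(struct jump_entry_sketch) == 16, "16-byte stride");
_Static_assert(offsetof(struct jump_entry_sketch, target) == 4, "target at +4");
_Static_assert(offsetof(struct jump_entry_sketch, key) == 8, "key at +8");

/* One entry after resolving its relocations to .text offsets. */
struct resolved_site {
	uint32_t code;
	uint32_t target;
};

/*
 * Return the index of the first entry whose target is also the target of
 * an earlier entry, or -1 if all targets are distinct.
 */
int first_shared_target(const struct resolved_site *s, int n)
{
	assert(s != NULL && n >= 0);
	for (int i = 1; i < n; i++)
		for (int j = 0; j < i; j++)
			if (s[i].target == s[j].target)
				return i;
	return -1;
}
```

Fed the two entries from the dump, {0x3e9, 0x3f0} and {0x424, 0x3f0}, this
flags the second one. A shared target is only a hint, since a compiler could
merge identical slow paths, so the code at both sites still has to be
compared.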
When we compare the code at both sites, we find:
3f0: bf 58 02 00 00 mov $0x258,%edi
3f5: 48 89 de mov %rbx,%rsi
3f8: 31 d2 xor %edx,%edx
3fa: e8 00 00 00 00 callq 3ff <get_fixed_ranges+0x6f>
3fb: R_X86_64_PC32 do_trace_read_msr-0x4
vs
42b: bf 59 02 00 00 mov $0x259,%edi
430: 48 89 de mov %rbx,%rsi
433: 31 d2 xor %edx,%edx
435: e8 00 00 00 00 callq 43a <get_fixed_ranges+0xaa>
436: R_X86_64_PC32 do_trace_read_msr-0x4
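Which is not in fact the same code: the first argument differs (0x258 vs
0x259), so the two sites cannot legitimately share a jump target.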
So for some reason the .rela__jump_table entries are buggy on this clang build.
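
The comparison of the two call sequences can also be done mechanically. This
is a minimal sketch, with the bytes copied by hand from the disassembly above
(the call displacement is still zero because the relocation has not been
applied); the array and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* mov $0x258,%edi; mov %rbx,%rsi; xor %edx,%edx; callq <unrelocated> */
const uint8_t site_3f0[] = {
	0xbf, 0x58, 0x02, 0x00, 0x00,
	0x48, 0x89, 0xde,
	0x31, 0xd2,
	0xe8, 0x00, 0x00, 0x00, 0x00,
};

/* mov $0x259,%edi; mov %rbx,%rsi; xor %edx,%edx; callq <unrelocated> */
const uint8_t site_42b[] = {
	0xbf, 0x59, 0x02, 0x00, 0x00,
	0x48, 0x89, 0xde,
	0x31, 0xd2,
	0xe8, 0x00, 0x00, 0x00, 0x00,
};

/* Return the offset of the first differing byte, or -1 if none differ. */
long first_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
	assert(a != NULL && b != NULL);
	for (size_t i = 0; i < n; i++)
		if (a[i] != b[i])
			return (long)i;
	return -1;
}
```

Calling first_diff(site_3f0, site_42b, sizeof(site_3f0)) reports offset 1,
the low byte of the immediate loaded into %edi, which is the msr argument
passed to do_trace_read_msr().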