RE: [linus:master] BUILD REGRESSION a2e5790d841658485d642196dbb0927303d6c22f

From: David Laight
Date: Thu Feb 08 2018 - 04:47:52 EST


From: Peter Zijlstra
> Sent: 08 February 2018 09:13
...
> > > Yeah, note says UD0 didn't eat a ModRM byte on old CPUs. But then that
> > > changed too. Fun stuff changing insn encoding underway.
> > >
> > > So if we opt for adding a ModRM byte, could a 0x90 NOP work so that it
> > > doesn't shit itself on those old CPUs?
> >
> > We could just also decide that the only thing that the modrm bytes of
> > UD0 actually *affect* is how the CPU might act for a page-crossing
> > instruction.
> >
> > Because I think that's the only semantic difference: if it's a
> > page-crosser, the instruction could take a page fault before raising
> > the #UD.
> >
> > Is there any other decode issue we might want to look out for?
>
> _The_ problem is that new binutils cannot sanely decode any function
> that has a WARN in (this very much includes perf annotate):
>
> old:
>
> 00000000000016a0 <copy_overflow>:
>     16a0:  48 89 f2                mov    %rsi,%rdx
>     16a3:  89 fe                   mov    %edi,%esi
>     16a5:  48 c7 c7 00 00 00 00    mov    $0x0,%rdi
>                     16a8: R_X86_64_32S  .rodata.str1.8+0x288
>     16ac:  e8 00 00 00 00          callq  16b1 <copy_overflow+0x11>
>                     16ad: R_X86_64_PC32  __warn_printk-0x4
>     16b1:  0f ff                   (bad)
>     16b3:  c3                      retq
>     16b4:  66 90                   xchg   %ax,%ax
>     16b6:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>     16bd:  00 00 00
>
> new:
>
> 00000000000016a0 <copy_overflow>:
>     16a0:  48 89 f2                mov    %rsi,%rdx
>     16a3:  89 fe                   mov    %edi,%esi
>     16a5:  48 c7 c7 00 00 00 00    mov    $0x0,%rdi
>                     16a8: R_X86_64_32S  .rodata.str1.8+0x288
>     16ac:  e8 00 00 00 00          callq  16b1 <copy_overflow+0x11>
>                     16ad: R_X86_64_PC32  __warn_printk-0x4
>     16b1:  0f ff c3                ud0    %ebx,%eax
>     16b4:  66 90                   xchg   %ax,%ax
>     16b6:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
>     16bd:  00 00 00
>
>
> I went through the register opcodes and matched it against the ModR/M
> encoding, and the best option I've found so far is using 0xd6 as the
> next byte.

Wouldn't 0xc3 work as well?
A retq is probably better than an extra (bad).
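
For reference, a rough sketch of how the byte after 0f ff splits into
ModR/M fields when the decoder treats UD0 as taking a ModR/M byte. The
struct and function names are made up for illustration; they are not
taken from the kernel or from binutils.

```c
#include <stdint.h>

/*
 * Illustrative only: split an x86 ModR/M byte into its three fields.
 *   mod = bits 7-6, reg = bits 5-3, rm = bits 2-0
 */
struct modrm_fields {
	unsigned int mod;
	unsigned int reg;
	unsigned int rm;
};

static struct modrm_fields modrm_split(uint8_t byte)
{
	struct modrm_fields f = {
		.mod = (byte >> 6) & 0x3,
		.reg = (byte >> 3) & 0x7,
		.rm  = byte & 0x7,
	};
	return f;
}

/*
 * mod == 3 is the register-direct form: no SIB byte and no displacement
 * follow, so the decoder consumes exactly one byte after the opcode.
 */
static int modrm_is_register_form(uint8_t byte)
{
	return modrm_split(byte).mod == 3;
}
```

Both 0xc3 (mod=3, reg=0, rm=3) and 0xd6 (mod=3, reg=2, rm=6) are
register forms, so either way only one byte gets eaten. 0x90 is
mod=2, which asks for a 32-bit displacement after it.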

Actually, objdump ought to print something more explicit than (bad) for
the deliberate UD0/UD1 opcodes.

David