Re: [PATCH 4/9] x86/alternative: Implement .retpoline_sites support

From: Alexander Lobakin
Date: Tue Oct 19 2021 - 05:47:39 EST


From: Alexander Lobakin <alobakin@xxxxx>
Date: Tue, 19 Oct 2021 00:25:30 +0000

> From: Alexander Lobakin <alobakin@xxxxx>
> Date: Mon, 18 Oct 2021 23:06:35 +0000
>
> Sorry for double posting, should've included this from the start.
>
> > Hi,
> >
> > Gave it a spin with Clang/LLVM, and
> >
> > > On Fri, Oct 15, 2021 at 04:24:08PM +0200, Borislav Petkov wrote:
> > > > On Wed, Oct 13, 2021 at 02:22:21PM +0200, Peter Zijlstra wrote:
> > > > > +static int patch_retpoline(void *addr, struct insn *insn, u8 *bytes)
> > > > > +{
> > > > > + void (*target)(void);
> > > > > + int reg, i = 0;
> > > > > +
> > > > > + if (cpu_feature_enabled(X86_FEATURE_RETPOLINE))
> > > > > + return -1;
> > > > > +
> > > > > + target = addr + insn->length + insn->immediate.value;
> > > > > + reg = (target - &__x86_indirect_thunk_rax) /
> > > > > + (&__x86_indirect_thunk_rcx - &__x86_indirect_thunk_rax);
> >
> > this triggers
> >
> > > > I guess you should compute those values once so that it doesn't have to
> > > > do them for each function invocation. And it does them here when I look
> > > > at the asm it generates.
> > >
> > > Takes away the simplicity of the thing. It can't know these values at
> > > compile time (due to external symbols etc..) although I suppose LTO
> > > might be able to fix that.
> > >
> > > Other than that, the above is the trivial form of reverse indexing an
> > > array.
> > >
> > > > > +
> > > > > + if (WARN_ON_ONCE(reg & ~0xf))
> > > > > + return -1;
> >
> > this:
> >
> > WARN in patch_retpoline:408: addr pcibios_scan_specific_bus+0x196/0x200, op 0xe8, reg 0xb88ca
> > WARN in patch_retpoline:408: addr xen_pv_teardown_msi_irqs+0x8d/0x120, op 0xe8, reg 0xb88ca
> > WARN in patch_retpoline:408: addr __mptcp_sockopt_sync+0x7e/0x200, op 0xe8, reg 0xb88ca
> > [...]
> > (thousands of them, but op == 0xe8 && reg == 0xb88ca are always the same)
>
> SMP alternatives: WARN in patch_retpoline:408: addr __strp_unpause+0x62/0x1b0/0xffffffff92d20a12, op 0xe8, reg 0xb88ca
> SMP alternatives: insn->length: 5, insn->immediate.value: 0xffae0989
> SMP alternatives: target: 0xffffffff928013a0/__x86_indirect_thunk_r11+0x0/0x20
> SMP alternatives: rax: 0xffffffff9223cd50, target - rax: 0x5c4650
> SMP alternatives: rcx - rax: 0x8
>
> Imm value and addr are different each time, the rest is the same.
> target is correct and even %pS works on it, but the distance
> between the r11 and rax thunks (0x5c4650) doesn't look right, and
> neither does rcx - rax being 0x8. Thunks are 0x11 sized + alignment,
> so it should be 0x20, and it is, according to vmlinux.map. Weird.
> Amps/&s?

Oh okay, it's because of ClangCFI:

SMP alternatives: You were looking for __typeid__ZTSFvvE_global_addr+0x370/0x1410 at 0xffffffffa523cd60,>
SMP alternatives: rax is __typeid__ZTSFvvE_global_addr+0x360/0x1410 at 0xffffffffa523cd50

Sorry for the confusion, this seems to be a side effect of using it on
Clang 12, while the original series supports only 13+. I'll
double-check and let you know if I find anything.
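
For the archives, here is a small userspace sketch of the arithmetic
behind the warning. It is not kernel code: thunk_index(),
reg_is_valid() and demo() are names made up for illustration, and the
"real" layout (0x20 stride, r11 being entry 11) is an assumption based
on the thunk size mentioned above. The addresses are copied from the
debug output quoted earlier. As far as I can tell, taking & of the
thunk symbols yields jump table entries 0x8 apart, so the reverse
indexing divides by the wrong stride from the wrong base:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reverse-index a table of equally sized entries: given the addresses
 * of the first two entries and a target address, return the entry
 * number. Same arithmetic as the reg computation in patch_retpoline(),
 * but on plain integers so it can run in userspace.
 */
int64_t thunk_index(uint64_t target, uint64_t first, uint64_t second)
{
	int64_t stride = (int64_t)(second - first);

	return (int64_t)(target - first) / stride;
}

/* Mirrors the WARN_ON_ONCE(reg & ~0xf) check: valid regs are 0..15. */
int reg_is_valid(int64_t reg)
{
	return (reg & ~(int64_t)0xf) == 0;
}

void demo(void)
{
	/* Values copied from the debug output in this thread. */
	uint64_t target  = 0xffffffff928013a0ULL; /* __x86_indirect_thunk_r11 */
	uint64_t cfi_rax = 0xffffffff9223cd50ULL; /* what &..._rax evaluated to */
	uint64_t cfi_rcx = cfi_rax + 0x8;         /* rcx - rax: 0x8 */

	/* With the jump table addresses the index comes out as garbage. */
	assert(thunk_index(target, cfi_rax, cfi_rcx) == 0xb88ca);
	assert(!reg_is_valid(0xb88ca));

	/* Assumed real layout: 0x20 stride, r11 being entry 11. */
	uint64_t real_rax = target - 11 * 0x20;

	assert(thunk_index(target, real_rax, real_rax + 0x20) == 11);
	assert(reg_is_valid(11));
}
```

0x5c4650 / 0x8 is exactly the 0xb88ca from the WARN lines, which is
why reg never changes: target and both bogus thunk addresses are the
same on every call.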

[ snip ]

Thanks,
Al