Re: [PATCH v2 14/14] objtool,x86: Rewrite retpoline thunk calls

From: Peter Zijlstra
Date: Fri Mar 19 2021 - 04:07:39 EST


On Thu, Mar 18, 2021 at 10:29:55PM -0500, Josh Poimboeuf wrote:
> On Thu, Mar 18, 2021 at 06:11:17PM +0100, Peter Zijlstra wrote:
> > When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
> > indirect call, have objtool rewrite it to:
> >
> > ALTERNATIVE "call __x86_indirect_thunk_\reg",
> > "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)
> >
> > Additionally, in order to not emit endless identical
> > .altinstr_replacement chunks, use a global symbol for them, see
> > __x86_indirect_alt_*.
> >
> > This also avoids objtool from having to do code generation.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>
> This is better than I expected. Nice workaround for not generating
> code.

Thanks :-)

> > +.macro ALT_THUNK reg
> > +
> > + .align 1
> > +
> > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg)
> > + ANNOTATE_RETPOLINE_SAFE
> > +1: call *%\reg
> > +2: .skip 5-(2b-1b), 0x90
> > +SYM_FUNC_END(__x86_indirect_alt_call_\reg)
> > +
> > +SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg)
> > + ANNOTATE_RETPOLINE_SAFE
> > +1: jmp *%\reg
> > +2: .skip 5-(2b-1b), 0x90
> > +SYM_FUNC_END(__x86_indirect_alt_jmp_\reg)
>
> This mysterious code needs a comment. Shouldn't it be in
> .altinstr_replacement or something?

Comment, yes, I suppose so. And no: if we stick it in
.altinstr_replacement, it gets thrown away with initmem, and module
alternative patching (which also refers to these symbols) will go
sideways.

> Also doesn't the alternative code already insert nops?

Problem is that the {call,jmp} *%\reg thing is not fixed length. They're
2 or 3 bytes depending on which register is picked.

We could make them all 3 bytes long and insert 0 or 1 nops, I suppose.
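
To make the 2-versus-3-byte point concrete, here is a minimal C sketch
of the encoding. It assumes the usual x86-64 rules: opcode 0xff with
ModRM /2 for an indirect call and /4 for an indirect jmp, plus a REX.B
prefix (0x41) for r8..r15, which is where the third byte comes from.
The names encode_indirect and THUNK_SLOT_LEN are made up for this
illustration and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define THUNK_SLOT_LEN 5	/* size of the "call rel32" being replaced */

/*
 * Encode "call *%reg" (is_jmp == 0) or "jmp *%reg" (is_jmp != 0) for
 * the 64-bit register number reg (0 = rax ... 15 = r15) into buf, then
 * pad with single-byte NOPs (0x90) up to THUNK_SLOT_LEN bytes, which is
 * what ".skip 5-(2b-1b), 0x90" does in the macro.
 *
 * Returns the length of the instruction itself (2 or 3), or 0 if reg is
 * out of range.
 */
static size_t encode_indirect(uint8_t buf[THUNK_SLOT_LEN], int is_jmp,
			      unsigned int reg)
{
	size_t len = 0;

	if (reg > 15)
		return 0;

	if (reg >= 8)
		buf[len++] = 0x41;	/* REX.B: selects r8..r15 */

	buf[len++] = 0xff;		/* opcode shared by call/jmp r/m64 */

	/* ModRM: mod=11 (register), reg=/2 (call) or /4 (jmp), rm=reg&7 */
	buf[len++] = 0xc0 | ((is_jmp ? 4 : 2) << 3) | (reg & 7);

	/* Fill the rest of the 5-byte slot with single-byte NOPs. */
	memset(buf + len, 0x90, THUNK_SLOT_LEN - len);

	return len;
}
```

With this sketch, "call *%rax" comes out as ff d0 followed by three
0x90 bytes, while "jmp *%r11" comes out as 41 ff e3 followed by two.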

Initially, alternatives wouldn't re-optimize nops in patched code; it
would simply append nops. And I had the above as:

1: INSN *%\reg
2: .nops 5-(2b-1b)

and we'd get a single right-sized nop. But the .nops directive is too
new; we still support binutils versions that don't have it :/

Hence, it now reads:

2: .skip 5-(2b-1b), 0x90

The end result is the alternative NOP optimizer patch at the start of
the series, which now also optimizes a bunch of unrelated cases that
were previously missed -- but crucially, it covers this case too :-)
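
As a rough illustration of what "optimizing" the 0x90 padding means,
the C sketch below collapses a trailing run of single-byte NOPs into
one multi-byte NOP of the same total length. It is not the kernel's
implementation: the function name, the table and the interface are
assumptions for this example, and it expects the caller to already know
where the real instruction ends. The byte sequences are the commonly
documented long-NOP encodings.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Commonly documented multi-byte NOP encodings, indexed by length 1..8. */
static const uint8_t nop_table[9][8] = {
	[1] = { 0x90 },
	[2] = { 0x66, 0x90 },
	[3] = { 0x0f, 0x1f, 0x00 },
	[4] = { 0x0f, 0x1f, 0x40, 0x00 },
	[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	[6] = { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	[7] = { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
	[8] = { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/*
 * Replace the padding in buf[insn_len..total) with one NOP of the right
 * size. Returns 0 on success (including when there is no padding), and
 * -1 if the tail is not purely 0x90 bytes or is longer than the table
 * covers.
 */
static int optimize_tail_nops(uint8_t *buf, size_t insn_len, size_t total)
{
	size_t pad, i;

	if (insn_len > total)
		return -1;

	pad = total - insn_len;
	if (pad == 0)
		return 0;
	if (pad > 8)
		return -1;

	/* Only touch the tail if it really is single-byte NOP padding. */
	for (i = insn_len; i < total; i++) {
		if (buf[i] != 0x90)
			return -1;
	}

	memcpy(buf + insn_len, nop_table[pad], pad);
	return 0;
}
```

For a 2-byte "call *%rax" in a 5-byte slot, the three 0x90 bytes become
the single 3-byte NOP 0f 1f 00; for a 3-byte "call *%r8", the two 0x90
bytes become 66 90.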

Anyway, yes I could make it 3 long.

> > +int arch_rewrite_retpoline(struct objtool_file *file,
> > + struct instruction *insn,
> > + struct reloc *reloc)
> > +{
> > + struct symbol *sym;
> > + char name[32] = "";
> > +
> > + if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
> > + return 0;
> > +
> > + sprintf(name, "__x86_indirect_alt_%s_%s",
> > + insn->type == INSN_JUMP_DYNAMIC ? "jmp" : "call",
> > + reloc->sym->name + 21);
> > +
> > + sym = find_symbol_by_name(file->elf, name);
> > + if (!sym) {
> > + sym = elf_create_undef_symbol(file->elf, name);
> > + if (!sym) {
> > + WARN("elf_create_undef_symbol");
> > + return -1;
> > + }
> > + }
> > +
> > + elf_add_alternative(file->elf, insn, sym,
> > + ALT_NOT(X86_FEATURE_RETPOLINE), 5, 5);
> > +
> > + return 0;
> > +}
>
> Need to propagate the error.

Oh, indeed so.
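
For reference, a minimal stand-alone C sketch of the name derivation
with the error propagated. It is only an illustration of the shape of
the fix, not the respin: alt_thunk_name, add_alternative_stub and
rewrite_one are hypothetical names, and the stub stands in for whatever
step can fail. The prefix check replaces the bare "+ 21" offset (21
being the length of "__x86_indirect_thunk_"), and snprintf guards the
32-byte buffer.

```c
#include <stdio.h>
#include <string.h>

#define THUNK_PREFIX "__x86_indirect_thunk_"

/*
 * Build "__x86_indirect_alt_{call,jmp}_<reg>" from a thunk symbol name
 * such as "__x86_indirect_thunk_rax".
 *
 * Returns 0 on success, -1 if thunk_sym lacks the expected prefix or
 * the result does not fit in buf.
 */
static int alt_thunk_name(char *buf, size_t len, int is_jmp,
			  const char *thunk_sym)
{
	size_t plen = strlen(THUNK_PREFIX);	/* 21 */
	int n;

	if (strncmp(thunk_sym, THUNK_PREFIX, plen) != 0)
		return -1;

	n = snprintf(buf, len, "__x86_indirect_alt_%s_%s",
		     is_jmp ? "jmp" : "call", thunk_sym + plen);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}

/* Hypothetical stand-in for a later step that can fail. */
static int add_alternative_stub(const char *name)
{
	return name[0] ? 0 : -1;
}

/* Every failure is handed back to the caller instead of being dropped. */
static int rewrite_one(int is_jmp, const char *thunk_sym)
{
	char name[32];

	if (alt_thunk_name(name, sizeof(name), is_jmp, thunk_sym))
		return -1;

	return add_alternative_stub(name);
}
```

Here rewrite_one(0, "__x86_indirect_thunk_rax") builds
"__x86_indirect_alt_call_rax" and returns 0, while a symbol without the
thunk prefix makes it return -1.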