Re: [PATCH 2/2] objtool: Optimize/fix retpoline alternative generation

From: Peter Zijlstra
Date: Fri Oct 08 2021 - 06:36:24 EST


On Fri, Oct 08, 2021 at 12:23:25AM -0700, Josh Poimboeuf wrote:
> On Thu, Oct 07, 2021 at 11:22:13PM +0200, Peter Zijlstra wrote:
> > When re-running objtool it will generate alterantives for all
>
> "alternatives"
>
> > retpoline hunks, even if they are already present.
> >
> > Discard the retpoline alternatives later so we can mark the
>
> Discard? or mark as ignored?

I used 'discard' since we don't actually generate insn->alts entries.

> > +++ b/tools/objtool/check.c
> > @@ -1468,6 +1468,14 @@ static int add_special_section_alts(stru
> > ret = -1;
> > goto out;
> > }
> > + /*
> > + * Skip (but mark) the retpoline alternatives so that we
> > + * don't generate them again.
> > + */
>
> I'm having a lot of trouble following this comment. In my half-sleeping
> state I'm theorizing this serves two purposes:
>
> 1) skip validating the alt (because why?)
>
> and
>
> 2) if re-running objtool on the object, don't generate a duplicate
> alternative? or maybe it also avoids duplicates for retpoline
> alternatives which were created in asm code?
>
> Not sure if I'm right but either way the comment needs more content.
>
> Also not sure about $SUBJECT, maybe it can be more specific.

Below better?

> BTW, this "re-running objtool" thing is probably a bigger problem that
> can be handled more broadly. When writing an object, we could write a
> dummy discard section ".discard.objtool_wuz_here" which tells it not to
> touch it a second time as weird things could happen.

A section can't work, since we run the first pass on individual
translation units, so if we get the wuz_here tag from one TU we can't
tell if we perhaps forgot to run on another.

Better detecting if there's actual work to do seems safer to me.
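
As a rough illustration of "detect whether there is work to do", here is
a toy sketch of the mark-then-skip idea. It is not objtool code:
struct toy_insn, toy_mark_existing() and toy_rewrite_retpolines() are
made-up names, and only the ignore_alts field name is borrowed from the
patch below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for objtool's struct instruction; NOT the real layout. */
struct toy_insn {
	bool is_retpoline_call;	/* indirect call/jump through a retpoline thunk */
	bool ignore_alts;	/* an alternative already exists for this site */
};

/*
 * Pass 1: mark every site that an existing alternative entry already
 * covers. covered[] holds indices into insns[] (hypothetical input that
 * stands in for walking the existing alternatives).
 */
static void toy_mark_existing(struct toy_insn *insns, size_t nr_insns,
			      const size_t *covered, size_t nr_covered)
{
	for (size_t i = 0; i < nr_covered; i++) {
		assert(covered[i] < nr_insns);
		insns[covered[i]].ignore_alts = true;
	}
}

/*
 * Pass 2: generate an alternative only for retpoline sites that are not
 * marked yet. Returns the number of alternatives generated.
 */
static size_t toy_rewrite_retpolines(struct toy_insn *insns, size_t nr_insns)
{
	size_t generated = 0;

	for (size_t i = 0; i < nr_insns; i++) {
		if (!insns[i].is_retpoline_call)
			continue;
		if (insns[i].ignore_alts)
			continue;	/* already has one, nothing to do */

		/* Model "alternative emitted" by marking the site. */
		insns[i].ignore_alts = true;
		generated++;
	}
	return generated;
}
```

In this model a second call to toy_rewrite_retpolines() on the same
array generates nothing, which is the property we want from a re-run.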

What I actually did yesterday was hack up --noinstr to WARN if any ELF
modification was done. I could turn that into a --ro flag or
something, which we can set on vmlinux if it's supposed to be a pure
validation pass.
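
A minimal sketch of what such a check could look like, purely as an
assumption: struct toy_elf, its 'changed' field and toy_finish() are
hypothetical names for illustration, and the read_only argument stands
in for whatever flag this would end up being.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the ELF handle; 'changed' is set by any modification. */
struct toy_elf {
	bool changed;
};

/*
 * Called at the end of a run. In a read-only (pure validation) pass any
 * recorded modification is reported; returns the number of warnings.
 */
static int toy_finish(const struct toy_elf *elf, bool read_only)
{
	assert(elf != NULL);

	if (read_only && elf->changed) {
		fprintf(stderr, "warning: ELF modified during read-only pass\n");
		return 1;
	}
	return 0;
}
```

With read_only unset the same modification passes silently, so the
check only constrains runs that are meant to validate.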

---
Subject: objtool: Optimize retpoline alternative generation
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Thu Oct 7 23:15:34 CEST 2021

When re-running objtool it will generate alternatives for all
retpoline hunks, even if they are already present.

Instead of discarding the retpoline alternatives early, hang onto them
a little longer such that the instructions can be marked as already
having an alternative, which then in turn enables avoiding generating
another one.

Having multiple alternatives for a single site is harmless, provided
they're identical; however, it does waste time and space.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
tools/objtool/arch/x86/decode.c | 3 +++
tools/objtool/check.c | 11 +++++++++++
tools/objtool/special.c | 8 --------
3 files changed, 14 insertions(+), 8 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -806,6 +806,9 @@ int arch_rewrite_retpolines(struct objto
if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
continue;

+ if (insn->ignore_alts)
+ continue;
+
reloc = insn->reloc;

sprintf(name, "__x86_indirect_alt_%s_%s",
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1477,6 +1477,17 @@ static int add_special_section_alts(stru
ret = -1;
goto out;
}
+
+ /*
+ * Don't generate alternative instruction streams
+ * (insn->alts) but instead mark the retpoline call as
+ * already having an alternative, so that we can avoid
+ * generating another instance.
+ */
+ if (new_insn->func && arch_is_retpoline(new_insn->func)) {
+ orig_insn->ignore_alts = true;
+ continue;
+ }
}

if (special_alt->group) {
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -109,14 +109,6 @@ static int get_alt_entry(struct elf *elf
return -1;
}

- /*
- * Skip retpoline .altinstr_replacement... we already rewrite the
- * instructions for retpolines anyway, see arch_is_retpoline()
- * usage in add_{call,jump}_destinations().
- */
- if (arch_is_retpoline(new_reloc->sym))
- return 1;
-
reloc_to_sec_off(new_reloc, &alt->new_sec, &alt->new_off);

/* _ASM_EXTABLE_EX hack */