Re: [PATCH 1/3] Revert "x86/retpoline: Remove .text..__x86.return_thunk section"

From: Josh Poimboeuf
Date: Wed Oct 11 2023 - 22:28:06 EST


On Thu, Oct 12, 2023 at 12:35:13AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 11, 2023 at 09:28:43AM -0700, Josh Poimboeuf wrote:
> > On Wed, Oct 11, 2023 at 09:41:42AM +0200, Peter Zijlstra wrote:
> > > > +++ b/tools/objtool/check.c
> > > > @@ -1610,6 +1610,15 @@ static int add_jump_destinations(struct objtool_file *file)
> > > > return -1;
> > > > }
> > > >
> > > > + /*
> > > > + * Since retpolines are in the same section as the return
> > > > + * thunk, they might not use a relocation when branching to it.
> > > > + */
> > > > + if (jump_dest->sym && jump_dest->sym->return_thunk) {
> > > > + add_return_call(file, insn, true);
> > > > + continue;
> > > > + }
> > >
> > > *urgh*... I mean, yes, that obviously works, but should we not also have
> > > the retpoline thingy for consistency? That case makes less sense though
> > > :/
> >
> > Consistency with what?
>
> the reloc case; specifically, I was thinking something along these
> lines:
>
> if (jump-dest->sym && jump_dest->sym->retpoline_thunk) {
> add_retpoline_call(file, insn);
> continue;
> }
>
> Then both reloc and immediate versions are more or less the same.

Ok, I see. If somebody were to do a {JMP,CALL}_NOSPEC somewhere in
retpoline.S, it would break similarly.

It doesn't hurt to add that to the patch, that way retpoline/rethunk
site detection is all handled the same.

> > The extra section seems pointless but maybe I'm missing something.
>
> By having the section things are better delineated I suppose, be it
> retpolines or rethunks, all references should be to inside the section
> (and thus have a reloc) while within the section there should never be
> a reference to itself.

As far as delineating things goes, a good argument could be made to
instead do that on the source code level. i.e., put the rethunk code in
rethunk.S rather than retpoline.S. Incidentally, that would also fix
this problem.

>From an object code standpoint, objtool is the only one who cares about
the relocs. It's a good idea to make objtool more robust against
non-relocs regardless, as the reloc assumption could always be broken
later by LTO.

BTW, I wonder if we can also get rid of .text..__x86.indirect_thunk and
just put most of the retpoline/rethunk code in .text (other than than
the SRSO aliasing hacks which still need special placement).

--
Josh