Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

From: Alexei Starovoitov
Date: Thu Jan 04 2018 - 13:36:09 EST


On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote:
> On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > Clearly Paul's approach to retpoline without lfence is faster.
> > I'm guessing it wasn't shared with amazon/intel until now and
> > this set of patches is going to adopt it, right?
> >
> > Paul, could you share a link to a set of alternative gcc patches
> > that do retpoline similar to the llvm diff?
>
> What is the alternative approach? Is it literally just doing a
>
>     call 1f
>  1: mov real_target,(%rsp)
>     ret
>
> on the assumption that the "ret" will always just predict to that "1"
> due to the call stack?

Pretty much.
Paul's writeup: https://support.google.com/faqs/answer/7625886
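tl;dr: jmp *%r11 gets converted to:

        call set_up_target;
capture_spec:
        pause;
        jmp capture_spec;
set_up_target:
        mov %r11, (%rsp);
        ret;

The ret is predicted to return to capture_spec (the address the call
pushed), so the capture_spec part loops speculatively, while the real
return goes to the address that was in %r11.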
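
To make the two paths explicit, here is a toy model of the sequence above.
It is only a sketch under simplifying assumptions: the return stack buffer
is modelled as a plain list that mirrors call/ret, and the names
run_retpoline, stack and rsb are made up for illustration. It does not
measure or prove anything about real hardware.

```python
def run_retpoline(target):
    """Toy model of `jmp *%r11` rewritten as the sequence above.

    Returns (speculative_destination, architectural_destination).
    """
    stack = []  # architectural stack: memory at (%rsp)
    rsb = []    # return stack buffer: what the CPU predicts for `ret`

    # call set_up_target: the return address (capture_spec) is pushed
    # to memory and also recorded in the return stack buffer.
    stack.append("capture_spec")
    rsb.append("capture_spec")

    # mov %r11, (%rsp): only the in-memory return address is overwritten;
    # the predictor's copy is untouched.
    stack[-1] = target

    # ret: speculation follows the RSB entry, retirement follows memory.
    speculative = rsb.pop()
    architectural = stack.pop()
    return speculative, architectural


if __name__ == "__main__":
    spec, arch = run_retpoline("real_target")
    print("speculative path  ->", spec)
    print("architectural path ->", arch)
```

In this model the speculative destination is always capture_spec, whatever
value was in %r11, which is the property the sequence relies on.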