Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

From: Paul Turner
Date: Fri Jan 05 2018 - 05:41:04 EST


On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote:
> On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > Clearly Paul's approach to retpoline without lfence is faster.

Using pause rather than lfence does not represent a fundamental difference here.

A protected indirect branch always adds ~25-30 cycles of overhead.

That this overhead can be avoided in practice is a function of two key factors:
(1) Kernel code uses fewer indirect branches.
(2) The overhead can be avoided for hot indirect branches via devirtualization,
    e.g. the semantic equivalent of:
        if (ptr == foo)
                foo();
        else
                (*ptr)();
    This allows foo() to be called directly, even though it was provided as an
    indirect target.
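
To make the devirtualization in (2) concrete, here is a minimal, self-contained
sketch in C. The names handler_fn, foo, bar and dispatch are hypothetical and
exist only for illustration; they are not taken from the patch set. The sketch
assumes foo is the known hot target for this call site.

```c
#include <stdio.h>

/* Hypothetical handler type; not from the retpoline patches. */
typedef int (*handler_fn)(int);

/* Assumed hot target: the function most callers pass in. */
int foo(int x)
{
	return x + 1;
}

/* Assumed cold target: reached only through the pointer. */
int bar(int x)
{
	return x * 2;
}

/*
 * Devirtualized dispatch: compare the pointer against the expected
 * hot target and call that target directly on a match. Only the
 * fallback path performs an indirect call, so only that path would
 * pay for a protected (retpoline) indirect branch.
 */
int dispatch(handler_fn ptr, int arg)
{
	if (ptr == foo)
		return foo(arg);	/* direct call */
	return (*ptr)(arg);		/* indirect call, cold path */
}
```

Calling dispatch(foo, 1) takes the direct path and dispatch(bar, 3) takes the
indirect one. The result is the same either way; only the kind of branch the
compiler emits differs, at the price of one compare against a constant address.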

> > I'm guessing it wasn't shared with amazon/intel until now and
> > this set of patches going to adopt it, right?
> >
> > Paul, could you share a link to a set of alternative gcc patches
> > that do retpoline similar to llvm diff ?
>
> What is the alternative approach? Is it literally just doing a
>
>     call 1f
> 1:  mov real_target,(%rsp)
>     ret
>
> on the assumption that the "ret" will always just predict to that "1"
> due to the call stack?
>
> Linus