Re: [RFC][PATCH 2/2] x86/retpoline: Compress retpolines

From: Peter Zijlstra
Date: Mon Feb 22 2021 - 06:29:24 EST


On Fri, Feb 19, 2021 at 08:14:39AM +0100, Borislav Petkov wrote:
> On Thu, Feb 18, 2021 at 05:59:40PM +0100, Peter Zijlstra wrote:
> > By using int3 as a speculation fence instead of lfence, we can shrink
> > the longest alternative to just 15 bytes:
> >
> >    0:   e8 05 00 00 00          callq  a <.altinstr_replacement+0xa>
> >    5:   f3 90                   pause
> >    7:   cc                      int3
> >    8:   eb fb                   jmp    5 <.altinstr_replacement+0x5>
> >    a:   48 89 04 24             mov    %rax,(%rsp)
> >    e:   c3                      retq
> >
> > This means we can change the alignment from 32 to 16 bytes and get 4
> > retpolines per cacheline, $I win.
>
> You mean I$ :)

Typin' so hard.

> In any case, for both:
>
> Reviewed-by: Borislav Petkov <bp@xxxxxxx>

Thanks, except I've been told there is a performance implication. But
since all that happened in sekrit, none of that is recorded :/

I was hoping for some people (Tony, Paul) to respond with more data.
Also, Andrew said that if we ditch the lfence we could also ditch the
pause.
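
For the record, the size arithmetic behind the quoted changelog can be
checked with a small userspace sketch. The byte values are copied from
the disassembly quoted above; the helper names, the 64-byte cacheline
and the "round the slot up to a power of two" rule are my assumptions
for illustration, not code from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Compressed retpoline bytes, copied from the quoted disassembly. */
static const unsigned char retpoline_rax[] = {
	0xe8, 0x05, 0x00, 0x00, 0x00,	/* callq a          */
	0xf3, 0x90,			/* pause            */
	0xcc,				/* int3             */
	0xeb, 0xfb,			/* jmp 5            */
	0x48, 0x89, 0x04, 0x24,		/* mov %rax,(%rsp)  */
	0xc3,				/* retq             */
};

/* The changelog claims the longest alternative is 15 bytes. */
static_assert(sizeof(retpoline_rax) == 15, "expected a 15 byte sequence");

#define CACHELINE_BYTES	64	/* assumption: 64-byte I$ lines */
#define PAUSE_BYTES	2	/* f3 90 */

/* Smallest power of two that is >= n (hypothetical helper). */
static size_t slot_size(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Number of thunks of length len that fit in one cacheline. */
static size_t thunks_per_line(size_t len)
{
	return CACHELINE_BYTES / slot_size(len);
}

/*
 * Length if the pause were dropped as well. The jmp displacement would
 * change, but the instruction lengths would not.
 */
static size_t len_without_pause(void)
{
	return sizeof(retpoline_rax) - PAUSE_BYTES;
}
```

With these assumptions thunks_per_line(sizeof(retpoline_rax)) is 4,
against 2 for the old 32-byte slots. Dropping the pause gets us to 13
bytes, which still rounds up to a 16-byte slot, so if I have this right
it would not buy any extra density; it would only matter for whatever
the performance implication turns out to be.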

So people, please speak up, and if possible share any data you might
still have from back when retpolines were developed, so that we can
have it on record.