Re: [PATCH] x86/retpoline: Avoid return buffer underflows on context switch

From: Peter Zijlstra
Date: Mon Jan 08 2018 - 17:11:36 EST


On Mon, Jan 08, 2018 at 12:15:31PM -0800, Andi Kleen wrote:
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index b8c8eeacb4be..e84e231248c2 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -53,6 +53,35 @@
> #endif
> .endm
>
> +/*
> + * We use 32-N: 32 is the max return buffer size,
> + * but there should have been at a minimum two
> + * controlled calls already: one into the kernel
> + * from entry*.S and another into the function
> + * containing this macro. So N=2, thus 30.
> + */
> +#define NUM_BRANCHES_TO_FILL 30
> +
> +/*
> + * Fill the CPU return branch buffer to prevent
> + * indirect branch prediction on underflow.
> + * Caller should check for X86_FEATURE_SMEP and X86_FEATURE_RETPOLINE
> + */
> +.macro FILL_RETURN_BUFFER
> +#ifdef CONFIG_RETPOLINE
> + .rept NUM_BRANCHES_TO_FILL
> + call 1221f
> + pause /* stop speculation */
> +1221:
> + .endr
> +#ifdef CONFIG_64BIT
> + addq $8*NUM_BRANCHES_TO_FILL, %rsp
> +#else
> + addl $4*NUM_BRANCHES_TO_FILL, %esp
> +#endif
> +#endif
> +.endm

So pjt's version aligns the sequence, unrolls it once and, per the
discussion earlier today (CET) or late last night (PST), only fills 16
entries.

Why is none of that done here? Also, can we pretty please stop using
those awful numeric labels? They make this stuff unreadable.

Also, pause on its own is unlikely to stop speculation, so that comment
doesn't make sense. pjt's version had a speculation trap in there, but
I can't see one here.
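
For illustration only, here is a rough sketch of the shape being asked
for: aligned call targets, one unroll, 16 entries, named labels and a
trap after each call. This is not pjt's patch. The macro name,
RSB_FILL_ENTRIES and every label name are made up for the example, the
trap is assumed to be a retpoline-style capture loop (pause; lfence;
jmp back to itself), and the stack fixup assumes 64-bit only.

```asm
/* Hypothetical names throughout; a sketch, not a tested implementation. */
#define RSB_FILL_ENTRIES	16	/* the count mentioned above */

.macro FILL_RETURN_BUFFER_SKETCH reg:req
	mov	$(RSB_FILL_ENTRIES / 2), \reg	/* two calls per iteration */
	.align	16
.Lrsb_fill_loop_\@:
	call	.Lrsb_fill_second_\@		/* pushes one RSB entry */
.Lrsb_trap_first_\@:				/* speculation trap */
	pause
	lfence
	jmp	.Lrsb_trap_first_\@
	.align	16
.Lrsb_fill_second_\@:
	call	.Lrsb_fill_next_\@		/* pushes a second RSB entry */
.Lrsb_trap_second_\@:				/* speculation trap */
	pause
	lfence
	jmp	.Lrsb_trap_second_\@
	.align	16
.Lrsb_fill_next_\@:
	dec	\reg
	jnz	.Lrsb_fill_loop_\@
	/* Drop the return addresses the calls pushed (64-bit assumed). */
	add	$(8 * RSB_FILL_ENTRIES), %rsp
.endm
```

The \@ suffix is the assembler's per-expansion counter, so the named
labels stay unique if the macro is expanded more than once in a file.
A speculated return that consumes one of the stuffed entries lands in
a trap loop and spins there instead of running on into whatever
follows.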