Re: x86/retpoline: Fill RSB on context switch for affected CPUs

From: Maciej S. Szmigiero
Date: Fri Mar 09 2018 - 08:13:00 EST


On 12.01.2018 18:49, Woodhouse, David wrote:
> When we context switch from a shallow call stack to a deeper one, as we
> 'ret' up the deeper side we may encounter RSB entries (predictions for
> where the 'ret' goes to) which were populated in userspace. This is
> problematic if we have neither SMEP nor KPTI (the latter of which marks
> userspace pages as NX for the kernel), as malicious code in userspace
> may then be executed speculatively. So overwrite the CPU's return
> prediction stack with calls which are predicted to return to an infinite
> loop, to "capture" speculation if this happens. This is required both
> for retpoline, and also in conjunction with IBRS for !SMEP && !KPTI.
>
> On Skylake+ the problem is slightly different, and an *underflow* of the
> RSB may cause errant branch predictions to occur. So there it's not so
> much overwrite, as *filling* the RSB to attempt to prevent it getting
> empty. This is only a partial solution for Skylake+ since there are many
> other conditions which may result in the RSB becoming empty. The full
> solution on Skylake+ is to use IBRS, which will prevent the problem even
> when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
> required on context switch.
>
> Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> Acked-by: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
> ---
(..)
> @@ -213,6 +230,23 @@ static void __init spectre_v2_select_mitigation(void)
>
> spectre_v2_enabled = mode;
> pr_info("%s\n", spectre_v2_strings[mode]);
> +
> + /*
> + * If we don't have SMEP or KPTI, then we run the risk of hitting
> + * userspace addresses in the RSB after a context switch from a
> + * shallow call stack to a deeper one. We must fill the entire
> + * RSB to avoid that, even when using IBRS.
> + *
> + * Skylake-era CPUs have a separate issue with *underflow* of the
> + * RSB, where they fall back to predicting 'ret' targets from the
> + * generic BTB. IBRS makes that safe, but we need to fill the RSB
> + * on context switch if we're using retpoline.
> + */
> + if ((!boot_cpu_has(X86_FEATURE_PTI) &&
> + !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
> + setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> + pr_info("Filling RSB on context switch\n");
> + }

Shouldn't the RSB filling on context switch also be done on non-IBPB
CPUs to protect (retpolined) user space tasks from other user space
tasks?

We already issue an IBPB when switching to high-value user space tasks
to protect them from other user space tasks.

Thanks,
Maciej