[RFC][PATCH] x86: proposed new ARCH_CAPABILITIES MSR bit for RSB-underflow

From: Dave Hansen
Date: Fri Feb 16 2018 - 14:23:26 EST



Intel is considering adding a new bit to the IA32_ARCH_CAPABILITIES
MSR to indicate when RSB underflow might happen. Feedback on this
would be greatly appreciated before the specification is finalized.

---

Background:

The RSB is a microarchitectural structure that attempts to help
predict the branch target of RET instructions. It is implemented as a
stack that is pushed on CALL and popped on RET. Being a stack, it can
become empty. On some processors, an empty condition leads to use of
the other indirect branch predictors which have been targeted by
Spectre variant 2 (branch target injection) exploits.
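
To make the underflow case concrete, here is a toy software model of
the behavior described above. It is only a sketch: the 16-entry depth,
the drop-oldest overflow policy and every toy_* name are assumptions
made for illustration, not a description of any real implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RSB_ENTRIES 16	/* assumed depth, for illustration only */

/* Where a RET prediction came from in this toy model. */
enum toy_ret_source {
	TOY_RET_FROM_RSB,	/* popped a valid RSB entry */
	TOY_RET_FROM_BTB,	/* RSB empty, fell back to the indirect predictor */
	TOY_RET_NO_PREDICTION,	/* RSB empty, no fallback on this CPU */
};

struct toy_rsb {
	uint64_t entry[TOY_RSB_ENTRIES];
	size_t depth;			/* number of valid entries */
	bool falls_back_to_btb;		/* the behavior RSBO would advertise */
};

/* CALL: push the return address. When full, drop the oldest entry. */
static void toy_rsb_call(struct toy_rsb *rsb, uint64_t ret_addr)
{
	if (rsb->depth == TOY_RSB_ENTRIES) {
		for (size_t i = 1; i < TOY_RSB_ENTRIES; i++)
			rsb->entry[i - 1] = rsb->entry[i];
		rsb->depth--;
	}
	rsb->entry[rsb->depth++] = ret_addr;
}

/*
 * RET: pop a prediction if one is available. On an empty RSB, report
 * whether this (toy) CPU would consult the other indirect predictors.
 * *predicted is only written when the result is TOY_RET_FROM_RSB.
 */
static enum toy_ret_source toy_rsb_ret(struct toy_rsb *rsb, uint64_t *predicted)
{
	if (rsb->depth > 0) {
		*predicted = rsb->entry[--rsb->depth];
		return TOY_RET_FROM_RSB;
	}
	return rsb->falls_back_to_btb ? TOY_RET_FROM_BTB
				      : TOY_RET_NO_PREDICTION;
}
```

In this model, one toy_rsb_call() followed by two toy_rsb_ret() calls
hits the empty case on the second RET. "RSB stuffing" amounts to
pushing enough benign entries that the empty case is not reached.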

Processors based on Skylake and its close derivatives have this
fallback behavior and need additional mitigation to avoid RSB-empty
conditions. Right now, the only place we do this "RSB stuffing"
operation is at context switch. We currently have a model/family list
to decide where to deploy this.

Problem:

That model/family list causes a problem in virtualization
environments. Hypervisors routinely expose a different model/family
to guests than what the bare-metal hardware has. This, among other
things, makes it easy to migrate guests between bare-metal systems
with different capabilities. However, it also defeats the
Skylake-generation model/family detection.
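
As a rough illustration of why the detection breaks, the sketch below
decodes family and model from a CPUID leaf 1 EAX signature and applies
a Skylake-era model check. The sig_* helpers are hypothetical and the
vendor check is omitted for brevity. The model numbers are the
Skylake/Kaby Lake ones I believe the current list uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Display family: base family, plus extended family when base is 0xf. */
static unsigned int sig_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;

	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/* Display model: extended model is prepended for base family 6 or 0xf. */
static unsigned int sig_model(uint32_t eax)
{
	unsigned int base_family = (eax >> 8) & 0xf;
	unsigned int model = (eax >> 4) & 0xf;

	if (base_family == 0x6 || base_family == 0xf)
		model |= ((eax >> 16) & 0xf) << 4;
	return model;
}

/* Hypothetical stand-in for a family 6 Skylake-era model list check. */
static bool sig_is_skylake_era(uint32_t eax)
{
	static const unsigned int models[] = { 0x4e, 0x5e, 0x55, 0x8e, 0x9e };

	if (sig_family(eax) != 6)
		return false;
	for (size_t i = 0; i < sizeof(models) / sizeof(models[0]); i++)
		if (sig_model(eax) == models[i])
			return true;
	return false;
}
```

A bare-metal signature of 0x000506e3 (family 6, model 0x5e) passes the
check. If the hypervisor instead presents 0x000306c3 (family 6, model
0x3c) to the guest, the same check fails even though the underlying
hardware has not changed.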

Solution:

To help address this issue, Intel is proposing a new bit in the
IA32_ARCH_CAPABILITIES MSR. This bit, "RSB Override" (RSBO), would
indicate:

The CPU may predict the target of RET instructions with a
predictor other than the RSB when the RSB is 'empty'.

Hardware implementations may choose to set this, but it can also be
set by a hypervisor that traps RDMSR and simply wants to indicate to a
guest that it should deploy RSB-underflow mitigations.
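
On the hypervisor side, this could look roughly like the sketch below.
Everything named sketch_* is hypothetical and does not correspond to
any existing hypervisor interface. The bit position is the one
proposed in this patch and may change before the specification is
final.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MSR_IA32_ARCH_CAPABILITIES	0x0000010a
#define SKETCH_ARCH_CAP_RSBO			(1ULL << 2)	/* proposed, not final */

/* Hypothetical per-VM policy knob. */
struct sketch_vm_policy {
	bool advertise_rsbo;	/* e.g. guest may migrate to affected hosts */
};

/*
 * Emulate a trapped guest RDMSR. Returns true if the MSR was handled
 * here and *guest_value is valid, false if the caller should handle
 * the MSR some other way.
 */
static bool sketch_emulate_rdmsr(const struct sketch_vm_policy *policy,
				 uint32_t msr, uint64_t host_value,
				 uint64_t *guest_value)
{
	if (msr != SKETCH_MSR_IA32_ARCH_CAPABILITIES)
		return false;

	*guest_value = host_value;
	if (policy->advertise_rsbo)
		*guest_value |= SKETCH_ARCH_CAP_RSBO;
	return true;
}
```

The point is that the hypervisor only ever ORs the bit in. It can ask
a guest for more mitigation than the current host needs, but this
sketch never clears a bit that the hardware set.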

An OS should assume that RSB-underflow mitigations are needed either
when RSBO=1 or when running on Skylake-generation processors with
RSBO=0.
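
Stated as a pure function, the rule above reduces to a logical OR of
the two signals. This is a standalone sketch with hypothetical
sketch_* names, separate from the kernel code in the patch, and it
assumes the proposed bit position.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ARCH_CAP_RSBO	(1ULL << 2)	/* proposed, not final */

/*
 * has_arch_cap: CPUID enumerates IA32_ARCH_CAPABILITIES
 * arch_cap:     value read from the MSR (ignored if !has_arch_cap)
 * skylake_era:  result of the existing model/family check
 */
static bool sketch_needs_rsb_underflow_mitigation(bool has_arch_cap,
						  uint64_t arch_cap,
						  bool skylake_era)
{
	bool rsbo = has_arch_cap && (arch_cap & SKETCH_ARCH_CAP_RSBO);

	/* RSBO=1 always counts. RSBO=0 does not clear a Skylake-era CPU. */
	return rsbo || skylake_era;
}
```

Note that RSBO=0 is deliberately not treated as "not affected": a
Skylake-generation part with RSBO=0 still gets the mitigation.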

Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: gnomes@xxxxxxxxxxxxxxxxxxx
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: thomas.lendacky@xxxxxxx
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Jiri Kosina <jikos@xxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxxx>
Cc: Paul Turner <pjt@xxxxxxxxxx>
Cc: David Woodhouse <dwmw@xxxxxxxxxxxx>
Cc: x86@xxxxxxxxxx
Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Cc: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Cc: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: Asit Mallick <asit.k.mallick@xxxxxxxxx>

---

b/arch/x86/include/asm/msr-index.h | 1
b/arch/x86/kernel/cpu/bugs.c | 47 +++++++++++++++++++++++++++++++++----
2 files changed, 43 insertions(+), 5 deletions(-)

diff -puN arch/x86/kernel/cpu/bugs.c~need-rsb-stuffing arch/x86/kernel/cpu/bugs.c
--- a/arch/x86/kernel/cpu/bugs.c~need-rsb-stuffing 2018-02-16 10:06:56.807610157 -0800
+++ b/arch/x86/kernel/cpu/bugs.c 2018-02-16 10:43:24.281604702 -0800
@@ -218,6 +218,43 @@ static bool __init is_skylake_era(void)
return false;
}

+/*
+ * This MSR has a bit to indicate whether the processor might fall back to
+ * the BTB. Hypervisors might lie about the model/family, breaking the
+ * is_skylake_era() check. They might also want the OS to deploy
+ * mitigations because it *might* get migrated to other hardware that has
+ * this behavior, even if current bare-metal hardware is not exposed.
+ */
+static bool cpu_has_rsb_override(void)
+{
+	u64 ia32_cap = 0;
+
+	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+	/* RSBO == RSB Override */
+	if (ia32_cap & ARCH_CAP_RSBO)
+		return true;
+
+	return false;
+}
+
+/*
+ * Can a RET instruction on this CPU fall back to the BTB?
+ */
+static bool __init cpu_ret_uses_btb(void)
+{
+	/* All Skylake-era processors fall back to BTB */
+	if (is_skylake_era())
+		return true;
+
+	/* Does the ARCH_CAPABILITIES override the model/family we see? */
+	if (cpu_has_rsb_override())
+		return true;
+
+	return false;
+}
+
static void __init spectre_v2_select_mitigation(void)
{
enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -283,14 +320,14 @@ retpoline_auto:
* from a shallow call stack to a deeper one. To prevent this fill
* the entire RSB, even when using IBRS.
*
- * Skylake era CPUs have a separate issue with *underflow* of the
- * RSB, when they will predict 'ret' targets from the generic BTB.
- * The proper mitigation for this is IBRS. If IBRS is not supported
- * or deactivated in favour of retpolines the RSB fill on context
+ * Some CPUs have a separate issue with *underflow* of the RSB,
+ * when they will predict 'ret' targets from the generic BTB. The
+ * proper mitigation for this is IBRS. If IBRS is not supported or
+ * deactivated in favour of retpolines the RSB fill on context
* switch is required.
*/
if ((!boot_cpu_has(X86_FEATURE_PTI) &&
- !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
+ !boot_cpu_has(X86_FEATURE_SMEP)) || cpu_ret_uses_btb()) {
setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
pr_info("Spectre v2 mitigation: Filling RSB on context switch\n");
}
diff -puN arch/x86/include/asm/msr-index.h~need-rsb-stuffing arch/x86/include/asm/msr-index.h
--- a/arch/x86/include/asm/msr-index.h~need-rsb-stuffing 2018-02-16 10:10:25.738609636 -0800
+++ b/arch/x86/include/asm/msr-index.h 2018-02-16 10:12:22.880609344 -0800
@@ -68,6 +68,7 @@
#define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */
#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */
+#define ARCH_CAP_RSBO (1 << 2) /* Needs RSB Stuffing */

#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e
_