[PATCH] x86/amd: Work around Erratum 1386 - XSAVES malfunction on context switch

From: Andrew Cooper
Date: Tue Mar 07 2023 - 12:52:50 EST


AMD Erratum 1386 is summarised as:

XSAVES Instruction May Fail to Save XMM Registers to the Provided
State Save Area

This piece of accidental chronomancy causes the %xmm registers to
occasionally reset back to an older value.

Ignore the XSAVES feature on all AMD Zen1/2 hardware. Affected parts
have no supervisor XSAVE states, so the XSAVEC instruction (which works
fine) is equivalent.

Reported-by: Tavis Ormandy <taviso@xxxxxxxxx>
Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CC: Ingo Molnar <mingo@xxxxxxxxxx>
CC: Borislav Petkov <bp@xxxxxxxxx>
CC: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
CC: x86@xxxxxxxxxx
CC: "H. Peter Anvin" <hpa@xxxxxxxxx>
CC: linux-kernel@xxxxxxxxxxxxxxx
CC: Tavis Ormandy <taviso@xxxxxxxxx>
CC: Alexander Monakov <amonakov@xxxxxxxxx>

Only compile tested.

This wants backporting to all stable trees that understand XSAVES, but
before 5.19(?) needs the XSAVEC support backporting too...
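
For anyone wanting to reason about which parts the family check in the
hunk below catches, here is a minimal user-space sketch of the same
policy working on raw CPUID register values. The helper names
(cpuid_family, mask_xsaves_for_erratum_1386) are hypothetical and not
kernel API. The family decode follows the architectural CPUID leaf 1
layout, and the XSAVEC/XSAVES bits are those of CPUID leaf 0xD,
sub-leaf 1, EAX. The sketch assumes the vendor has already been checked
to be AMD, as is implicitly the case inside init_amd_zn().

```c
#include <stdint.h>

/* CPUID leaf 0xD, sub-leaf 1, EAX: bit 1 = XSAVEC, bit 3 = XSAVES. */
#define XSTATE_EAX_XSAVEC (1u << 1)
#define XSTATE_EAX_XSAVES (1u << 3)

/* Hypothetical helper: decode the x86 family from CPUID leaf 1 EAX. */
static inline unsigned int cpuid_family(uint32_t leaf1_eax)
{
	unsigned int family = (leaf1_eax >> 8) & 0xf;

	/* The extended family field only counts when the base family is 0xf. */
	if (family == 0xf)
		family += (leaf1_eax >> 20) & 0xff;

	return family;
}

/*
 * Hypothetical helper mirroring the patch's policy: on family 0x17
 * (Zen1/2), hide the XSAVES bit and leave every other bit, including
 * XSAVEC, untouched.  Assumes the caller already knows the CPU is AMD.
 */
static inline uint32_t mask_xsaves_for_erratum_1386(uint32_t leaf1_eax,
						     uint32_t xstate_eax)
{
	if (cpuid_family(leaf1_eax) == 0x17)
		xstate_eax &= ~XSTATE_EAX_XSAVES;

	return xstate_eax;
}
```

Feeding in the leaf 1 EAX signature of a Zen1 or Zen2 part yields family
0x17 and drops only the XSAVES bit, while a family 0x19 signature passes
through unchanged. As in the patch, the check is keyed on family alone
and does not look at the microcode revision.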
---
arch/x86/kernel/cpu/amd.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 380753b14cab..f3a4bb479fd5 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -890,6 +890,17 @@ static void init_amd_zn(struct cpuinfo_x86 *c)
 	node_reclaim_distance = 32;
 #endif
 
+	/*
+	 * Work around Erratum 1386. The XSAVES instruction malfunctions in
+	 * certain circumstances on Zen1/2 uarch, and not all parts have had
+	 * updated microcode at the time of writing (March 2023).
+	 *
+	 * Affected parts all have no supervisor XSAVE states, meaning that
+	 * the XSAVEC instruction (which works fine) is equivalent.
+	 */
+	if (c->x86 == 0x17)
+		clear_cpu_cap(c, X86_FEATURE_XSAVES);
+
 	/* Fix up CPUID bits, but only if not virtualised. */
 	if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) {

--
2.30.2