[PATCH] x86/AMD: Apply erratum 688 on machines without a BIOS fix

From: sonofagun
Date: Wed Oct 19 2016 - 10:37:16 EST



AMD F14h machines have an erratum which can cause unpredictable program behaviour under specific branch conditions. The workaround is to set MSRC001_1021[14] and MSRC001_1021[3]. Both bits are documented as reserved in this MSR, so we rely on AMD's recommendation. Since some machines never received a BIOS update containing the workaround, apply it unconditionally on this family too. Our Compaq CQ57 laptop, whose firmware is broken in various areas, has neither bit set (MSRc0011021: 0000000010208000).

HP has not released a fixed BIOS, even though we contacted them and requested an update that fixes all the errors we spotted. As the laptop is no longer covered by any warranty, they do not support it. HP does not care, but the Linux kernel cares enough to patch out-of-warranty hardware with crappy firmware!

Thanks to the author of commit d1992996753132e2dafe955cccb2fb0714d3cfc4 (x86/AMD: Apply erratum 665 on machines without a BIOS fix), who paved the way for this fix. That patch did not apply to our machine, but it brought a long-standing bug of our E-300 laptop back to the surface. We had observed poor performance under Debian, and things got worse after switching to Ubuntu, as crashes became more frequent. As a result, the laptop was replaced with a desktop.

After some time, we decided to dig deeper and find out what was wrong with the laptop. perf showed that something was terribly wrong: branch-misses reached 40% within a minute of booting the E-300 (Ontario C0) APU. Disabling the second CPU did not help either. Erratum 688 in the CPU Revision Guide looked promising, as it described our symptoms, so we prepared a fix. With it the laptop works and has both workaround bits set (MSRc0011021: 000000001020c008). Since this erratum affects many laptops and some tablets, we request that the fix be backported to stable kernels.
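As a sanity check of the two register values quoted in this message, the sketch below ORs bits 3 and 14 into the original value. The helper and macro names are made up for illustration and are not part of the patch; only the bit numbers and the two MSR values come from the text above.

```c
#include <stdint.h>

/* Bits of MSRC001_1021 named by the erratum 688 workaround (from the text). */
#define ERRATUM_688_BIT_LO	3
#define ERRATUM_688_BIT_HI	14
#define ERRATUM_688_MASK \
	((UINT64_C(1) << ERRATUM_688_BIT_LO) | (UINT64_C(1) << ERRATUM_688_BIT_HI))

/* Hypothetical helper: value the MSR holds once both bits are set. */
static inline uint64_t erratum_688_apply(uint64_t msr_value)
{
	return msr_value | ERRATUM_688_MASK;
}

/* Hypothetical helper: nonzero when both workaround bits are already set. */
static inline int erratum_688_applied(uint64_t msr_value)
{
	return (msr_value & ERRATUM_688_MASK) == ERRATUM_688_MASK;
}
```

With the firmware value 0x10208000 as input, erratum_688_apply() yields 0x1020c008, which matches the value seen after the fix.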

Tested on Compaq CQ57-499 laptop.


Signed-off-by: Ioannis Barkas <sonofagun@xxxxxxxxxxxxxxx>
Signed-off-by: Nikos Barkas <levelwol@xxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>

---

Hello, we are Ioannis Barkas (sonofagun@xxxxxxxxxxxxxxx) and Nikos Barkas (levelwol@xxxxxxxxx).

This patch was sent from my Yahoo e-mail this morning and got rejected! Why?
Resending...

We have had poor performance on our AMD laptop with Debian for some years. The initial value of MSRc0011021 is 0000000010208000h and that of D18F4x164 is 00000003h. The laptop was not usable even with Ubuntu 16.04 using the radeon driver. Worse, opening https://planefinder.net/ in Firefox after booting Ubuntu made Firefox crash again and again. With this patch applied we have not hit any problem with that webpage or Firefox. Unfortunately, linux-tools was not available for our custom kernel, so perf could not be launched :( When the patch lands in the Ubuntu 16.10 kernel, we will recheck. If branch-misses remain above 10%, we will open a bug for it.
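For anyone who wants to check their own F14h machine, here is a minimal user-space sketch. It assumes the msr driver is loaded (modprobe msr) and that the program runs as root, and it reads the register through /dev/cpu/N/msr. read_msr() and report_erratum_688() are hypothetical helper names, not kernel API, and this is not necessarily how the values quoted above were obtained.

```c
#define _FILE_OFFSET_BITS 64	/* the MSR number is used as the file offset */
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define MSR_IC_CFG	0xC0011021u

/* Read one MSR on the given CPU via the msr driver; 0 on success, -1 on failure. */
static int read_msr(int cpu, uint32_t reg, uint64_t *value)
{
	char path[64];
	ssize_t n;
	int fd;

	snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* The driver expects an 8-byte read at offset == MSR number. */
	n = pread(fd, value, sizeof(*value), (off_t)reg);
	close(fd);

	return n == (ssize_t)sizeof(*value) ? 0 : -1;
}

/* Print the register for one CPU and the state of bits 3 and 14. */
static int report_erratum_688(int cpu)
{
	uint64_t v;

	if (read_msr(cpu, MSR_IC_CFG, &v) != 0) {
		fprintf(stderr, "cpu %d: cannot read MSR (msr module loaded? root?)\n", cpu);
		return -1;
	}

	printf("cpu %d: MSRc0011021 = %016llx, bit3=%d bit14=%d\n", cpu,
	       (unsigned long long)v, (int)((v >> 3) & 1), (int)((v >> 14) & 1));
	return 0;
}
```

Calling report_erratum_688() once per online CPU from a small main() is enough. Both bits reading 1 on every CPU would indicate that the BIOS or the kernel has applied the workaround.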

--- a/arch/x86/kernel/cpu/amd.c 2016-10-07 16:03:33.000000000 +0300
+++ b/arch/x86/kernel/cpu/amd.c 2016-10-12 13:25:34.791720549 +0300
@@ -680,6 +680,18 @@ static void init_amd_ln(struct cpuinfo_x
 	msr_set_bit(MSR_AMD64_DE_CFG, 31);
 }

+#define MSR_AMD64_IC_CFG 0xC0011021
+
+static void init_amd_on(struct cpuinfo_x86 *c)
+{
+	/*
+	 * Apply erratum 688 fix unconditionally so machines without a BIOS
+	 * fix work.
+	 */
+	msr_set_bit(MSR_AMD64_IC_CFG, 3);
+	msr_set_bit(MSR_AMD64_IC_CFG, 14);
+}
+
 static void init_amd_bd(struct cpuinfo_x86 *c)
 {
 	u64 value;
@@ -738,6 +750,7 @@ static void init_amd(struct cpuinfo_x86
 	case 0xf: init_amd_k8(c); break;
 	case 0x10: init_amd_gh(c); break;
 	case 0x12: init_amd_ln(c); break;
+	case 0x14: init_amd_on(c); break;
 	case 0x15: init_amd_bd(c); break;
 	}