[tip:ras/urgent] x86/mce: Check for alternate indication of machine check recovery on Skylake

From: tip-bot for Tony Luck
Date: Thu Jun 07 2018 - 16:25:37 EST


Commit-ID: 4c5717da1d021cf368eabb3cb1adcaead56c0d1e
Gitweb: https://git.kernel.org/tip/4c5717da1d021cf368eabb3cb1adcaead56c0d1e
Author: Tony Luck <tony.luck@xxxxxxxxx>
AuthorDate: Fri, 25 May 2018 14:42:09 -0700
Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitDate: Thu, 7 Jun 2018 22:22:12 +0200

x86/mce: Check for alternate indication of machine check recovery on Skylake

Currently we just check the "CAPID0" register to see whether the CPU
can recover from machine checks.

But there are also some special SKUs which do not have all advanced
RAS features, but do enable machine check recovery for use with NVDIMMs.

Add a check for any of bits {8:5} in the "CAPID5" register (each
reports some NVDIMM mode available, if any of them are set, then
the system supports memory machine check recovery).

Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Qiuxu Zhuo <qiuxu.zhuo@xxxxxxxxx>
Cc: Ashok Raj <ashok.raj@xxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # 4.9
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxx>
Link: https://lkml.kernel.org/r/03cbed6e99ddafb51c2eadf9a3b7c8d7a0cc204e.1527283897.git.tony.luck@xxxxxxxxx

---
arch/x86/kernel/quirks.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 697a4ce04308..736348ead421 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -645,12 +645,19 @@ static void quirk_intel_brickland_xeon_ras_cap(struct pci_dev *pdev)
/* Skylake */
static void quirk_intel_purley_xeon_ras_cap(struct pci_dev *pdev)
{
- u32 capid0;
+ u32 capid0, capid5;

pci_read_config_dword(pdev, 0x84, &capid0);
+ pci_read_config_dword(pdev, 0x98, &capid5);

- if ((capid0 & 0xc0) == 0xc0)
+ /*
+ * CAPID0{7:6} indicate whether this is an advanced RAS SKU
+ * CAPID5{8:5} indicate that various NVDIMM usage modes are
+ * enabled, so memory machine check recovery is also enabled.
+ */
+ if ((capid0 & 0xc0) == 0xc0 || (capid5 & 0x1e0))
static_branch_inc(&mcsafe_key);
+
}
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x0ec3, quirk_intel_brickland_xeon_ras_cap);
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2fc0, quirk_intel_brickland_xeon_ras_cap);