[PATCH 3/3] x86/mce: include type of core when reporting a machine check error

From: Ricardo Neri
Date: Fri Oct 02 2020 - 16:17:43 EST


In hybrid parts, each type of core reports different types of machine check
errors as the machine check error blocks are tied to different parts of the
hardware. Furthermore, errors may be different across micro-architecture
versions. Thus, in order to decode errors, userspace tools need to know the
type of core as well as the native model ID of the CPU which reported the
error.

This extra information is included in the error report only when running
on hybrid parts. This preserves the existing behavior when running on
non-hybrid parts. Hence, legacy userspace tools running on new kernels
and hybrid hardware can still understand the format of the reported
error.

Cc: "Ravi V Shankar" <ravi.v.shankar@xxxxxxxxx>
Cc: linux-edac@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
---
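Note for userspace tool authors: a minimal sketch, not part of this patch,
of how a decoder might split the new hybrid_info field. It assumes the
field holds the raw EAX value of CPUID leaf 0x1a, as mce_setup() stores it
below, with the core type in bits 31:24 and the native model ID in bits
23:0. All helper names are hypothetical, and the 0x20/0x40 core type
values are taken from the Intel SDM description of that leaf rather than
from this series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout of CPUID.1AH:EAX, mirrored from __print_mce() below. */
#define HYBRID_TYPE_SHIFT	24
#define HYBRID_MODEL_MASK	0x00ffffffu

/* Core type values as listed in the Intel SDM (assumption, not in this patch). */
#define HYBRID_TYPE_ATOM	0x20u
#define HYBRID_TYPE_CORE	0x40u

/* Bits 31:24 of hybrid_info: the type of core that reported the error. */
static inline unsigned int hybrid_core_type(uint32_t hybrid_info)
{
	return hybrid_info >> HYBRID_TYPE_SHIFT;
}

/* Bits 23:0 of hybrid_info: the native model ID of that core. */
static inline unsigned int hybrid_native_model_id(uint32_t hybrid_info)
{
	return hybrid_info & HYBRID_MODEL_MASK;
}

/* Human-readable name for a core type; unrecognized values map to "unknown". */
static inline const char *hybrid_core_type_name(unsigned int type)
{
	switch (type) {
	case HYBRID_TYPE_ATOM:
		return "Atom";
	case HYBRID_TYPE_CORE:
		return "Core";
	default:
		return "unknown";
	}
}

/* Format the two fields in the same style as the new pr_emerg() line. */
static inline int hybrid_info_format(uint32_t hybrid_info, char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "HYBRID_TYPE %x HYBRID_NATIVE_MODEL_ID %x",
			hybrid_core_type(hybrid_info),
			hybrid_native_model_id(hybrid_info));
}
```

A tool would call these helpers only on records that carry the field;
records from non-hybrid parts leave hybrid_info at zero, which the sketch
reports as type "unknown".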
arch/x86/include/uapi/asm/mce.h | 1 +
arch/x86/kernel/cpu/mce/core.c | 7 +++++++
2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index db9adc081c5a..e730572186d6 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -36,6 +36,7 @@ struct mce {
 	__u64 ppin; /* Protected Processor Inventory Number */
 	__u32 microcode; /* Microcode revision */
 	__u64 kflags; /* Internal kernel use */
+	__u32 hybrid_info; /* Type and native model ID in hybrid parts */
 };

#define MCE_GET_RECORD_LEN _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index a6ff407dec71..ecac8d9b6070 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -143,6 +143,9 @@ noinstr void mce_setup(struct mce *m)
 	m->apicid = cpu_data(m->extcpu).initial_apicid;
 	m->mcgcap = __rdmsr(MSR_IA32_MCG_CAP);
 
+	if (this_cpu_has(X86_FEATURE_HYBRID_CPU))
+		m->hybrid_info = cpuid_eax(0x1a);
+
 	if (this_cpu_has(X86_FEATURE_INTEL_PPIN))
 		m->ppin = __rdmsr(MSR_PPIN);
 	else if (this_cpu_has(X86_FEATURE_AMD_PPIN))
@@ -264,6 +267,10 @@ static void __print_mce(struct mce *m)
 	pr_emerg(HW_ERR "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x microcode %x\n",
 		m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid,
 		m->microcode);
+
+	if (this_cpu_has(X86_FEATURE_HYBRID_CPU))
+		pr_emerg(HW_ERR "HYBRID_TYPE %x HYBRID_NATIVE_MODEL_ID %x\n",
+			 m->hybrid_info >> 24, m->hybrid_info & 0xffffff);
 }

static void print_mce(struct mce *m)
--
2.17.1