[PATCH] x86/mce: Initialize "bank" when we find a fatal error in mce_no_way_out()

From: Tony Luck
Date: Thu Jan 31 2019 - 19:41:27 EST


Internal injection testing crashed with a console log that said:

mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134

This caused a lot of head scratching because the MCACOD (bits 15:0) of that
status is a signature from an L1 data cache error. But Linux says that it found
it in "Bank 0", which on this model CPU only reports L1 instruction cache errors.

The answer was that Linux doesn't initialize "m->bank" in the case that it finds
a fatal error in the mce_no_way_out() pre-scan of banks. If this was a local machine
check, then we pass this partially initialized "struct mce" to mce_panic().

Fix is simple. Just initialize m->bank in the case that we found a fatal error.

Fixes: 40c36e2741d7 ("x86/mce: Fix incorrect "Machine check from unknown source" message")
Cc: stable@xxxxxxxxxxxxxxx # v4.18 Note pre-v5.0 arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c
Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
---
arch/x86/kernel/cpu/mce/core.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c7225cb1b..6ce290c506d9 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -784,6 +784,7 @@ static int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp,
quirk_no_way_out(i, m, regs);

if (mce_severity(m, mca_cfg.tolerant, &tmp, true) >= MCE_PANIC_SEVERITY) {
+ m->bank = i;
mce_read_aux(m, i);
*msg = tmp;
return 1;
--
2.19.1