[tip:x86/urgent] x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()

From: tip-bot for Tony Luck
Date: Sun Feb 03 2019 - 07:37:28 EST


Commit-ID: d28af26faa0b1daf3c692603d46bc4687c16f19e
Gitweb: https://git.kernel.org/tip/d28af26faa0b1daf3c692603d46bc4687c16f19e
Author: Tony Luck <tony.luck@xxxxxxxxx>
AuthorDate: Thu, 31 Jan 2019 16:33:41 -0800
Committer: Borislav Petkov <bp@xxxxxxx>
CommitDate: Sun, 3 Feb 2019 13:24:24 +0100

x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()

Internal injection testing crashed with a console log that said:

mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134

This caused a lot of head scratching because the MCACOD (bits 15:0) of
that status is a signature from an L1 data cache error. But Linux says
that it found it in "Bank 0", which on this model CPU only reports L1
instruction cache errors.

The answer was that Linux doesn't initialize "m->bank" in the case that
it finds a fatal error in the mce_no_way_out() pre-scan of banks. If
this was a local machine check, then this partially initialized struct
mce is being passed to mce_panic().

Fix is simple: just initialize m->bank in the case of a fatal error.

Fixes: 40c36e2741d7 ("x86/mce: Fix incorrect "Machine check from unknown source" message")
Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Vishal Verma <vishal.l.verma@xxxxxxxxx>
Cc: x86-ml <x86@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v4.18 Note pre-v5.0 arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c
Link: https://lkml.kernel.org/r/20190201003341.10638-1-tony.luck@xxxxxxxxx
---
arch/x86/kernel/cpu/mce/core.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 672c7225cb1b..6ce290c506d9 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -784,6 +784,7 @@ static int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp,
quirk_no_way_out(i, m, regs);

if (mce_severity(m, mca_cfg.tolerant, &tmp, true) >= MCE_PANIC_SEVERITY) {
+ m->bank = i;
mce_read_aux(m, i);
*msg = tmp;
return 1;