[tip:x86/urgent] x86/MCE/AMD: Fix the thresholding machinery initialization order

From: tip-bot for Borislav Petkov
Date: Wed Nov 28 2018 - 04:22:27 EST


Commit-ID: 60c8144afc287ef09ce8c1230c6aa972659ba1bb
Gitweb: https://git.kernel.org/tip/60c8144afc287ef09ce8c1230c6aa972659ba1bb
Author: Borislav Petkov <bp@xxxxxxx>
AuthorDate: Tue, 27 Nov 2018 14:41:37 +0100
Committer: Borislav Petkov <bp@xxxxxxx>
CommitDate: Wed, 28 Nov 2018 10:10:36 +0100

x86/MCE/AMD: Fix the thresholding machinery initialization order

Currently, the code sets up the thresholding interrupt vector and only
then goes about initializing the thresholding banks. This is wrong
because an early thresholding interrupt would cause a NULL pointer
dereference when accessing those banks and prevent the machine from
booting.

Therefore, set the thresholding interrupt vector only *after* having
initialized the banks successfully.
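
As a rough illustration of that ordering, here is a minimal userspace
sketch. It is not the kernel code: NR_BANKS, struct bank, init_banks(),
early_setup(), late_init() and the handler names are all hypothetical
stand-ins. The only point it tries to show is that early setup records
that the interrupt is wanted, and the real handler is installed only
once the data it touches exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_BANKS 4		/* hypothetical bank count for this sketch */

struct bank {
	unsigned int error_count;
};

static struct bank *banks;	/* stays NULL until init_banks() succeeds */
static bool irq_wanted;		/* plays the role of thresholding_irq_en */
static int handled;		/* interrupts seen by the real handler */

/* Harmless default: ignores the interrupt and touches no bank state. */
static void default_handler(void)
{
}

/* Real handler: dereferences banks, so it must not run before init. */
static void real_handler(void)
{
	size_t i;

	assert(banks != NULL);
	for (i = 0; i < NR_BANKS; i++)
		banks[i].error_count = 0;
	handled++;
}

static void (*irq_vector)(void) = default_handler;

static int init_banks(void)
{
	banks = calloc(NR_BANKS, sizeof(*banks));
	return banks ? 0 : -1;
}

/* Early setup: only note that the interrupt should be enabled later. */
static void early_setup(void)
{
	irq_wanted = true;
}

/* Late init: create the banks first, publish the handler afterwards. */
static int late_init(void)
{
	int err = init_banks();

	if (err)
		return err;

	if (irq_wanted)
		irq_vector = real_handler;

	return 0;
}
```

In this sketch, a call through irq_vector between early_setup() and
late_init() reaches default_handler() and never touches the still-NULL
banks pointer.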

Fixes: 18807ddb7f88 ("x86/mce/AMD: Reset Threshold Limit after logging error")
Reported-by: Rafał Miłecki <rafal@xxxxxxxxxx>
Reported-by: John Clemens <clemej@xxxxxxxxx>
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
Tested-by: Rafał Miłecki <rafal@xxxxxxxxxx>
Tested-by: John Clemens <john@xxxxxxxxxx>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@xxxxxxxxx>
Cc: linux-edac@xxxxxxxxxxxxxxx
Cc: stable@xxxxxxxxxxxxxxx
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: x86@xxxxxxxxxx
Cc: Yazen Ghannam <Yazen.Ghannam@xxxxxxx>
Link: https://lkml.kernel.org/r/20181127101700.2964-1-zajec5@xxxxxxxxx
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201291
---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 19 ++++++-------------
1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index dd33c357548f..e12454e21b8a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -56,7 +56,7 @@
 /* Threshold LVT offset is at MSR0xC0000410[15:12] */
 #define SMCA_THR_LVT_OFF	0xF000
 
-static bool thresholding_en;
+static bool thresholding_irq_en;
 
 static const char * const th_names[] = {
 	"load_store",
@@ -534,9 +534,8 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,

 set_offset:
 	offset = setup_APIC_mce_threshold(offset, new);
-
-	if ((offset == new) && (mce_threshold_vector != amd_threshold_interrupt))
-		mce_threshold_vector = amd_threshold_interrupt;
+	if (offset == new)
+		thresholding_irq_en = true;
 
 done:
 	mce_threshold_block_init(&b, offset);
@@ -1357,9 +1356,6 @@ int mce_threshold_remove_device(unsigned int cpu)
 {
 	unsigned int bank;
 
-	if (!thresholding_en)
-		return 0;
-
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (!(per_cpu(bank_map, cpu) & (1 << bank)))
 			continue;
@@ -1377,9 +1373,6 @@ int mce_threshold_create_device(unsigned int cpu)
 	struct threshold_bank **bp;
 	int err = 0;
 
-	if (!thresholding_en)
-		return 0;
-
 	bp = per_cpu(threshold_banks, cpu);
 	if (bp)
 		return 0;
@@ -1408,9 +1401,6 @@ static __init int threshold_init_device(void)
 {
 	unsigned lcpu = 0;
 
-	if (mce_threshold_vector == amd_threshold_interrupt)
-		thresholding_en = true;
-
 	/* to hit CPUs online before the notifier is up */
 	for_each_online_cpu(lcpu) {
 		int err = mce_threshold_create_device(lcpu);
@@ -1419,6 +1409,9 @@ static __init int threshold_init_device(void)
 			return err;
 	}
 
+	if (thresholding_irq_en)
+		mce_threshold_vector = amd_threshold_interrupt;
+
 	return 0;
 }
 /*