[PATCH 8/8] x86/mce: Fix thermal throttling reporting after kexec

From: Borislav Petkov
Date: Mon Oct 19 2015 - 05:18:07 EST


From: Andi Kleen <ak@xxxxxxxxxxxxxxx>

The per-CPU thermal vector init code checks whether the thermal vector
is already installed; if it is, it complains and bails out.

This happens after kexec, because kernel shutdown does not clear the
thermal vector APIC register.

This causes two problems:

1. We never fully initialize thermal reporting after kexec. The CPU
itself is likely still initialized, as the previous kernel should have
done it. But we don't set up the software pointer to the thermal vector,
so reporting may end up with an unknown thermal interrupt message.

2. It complains for every logical CPU, even though the value is
actually derived from the BP only.

That yields one message per CPU, so on larger systems it becomes very
noisy and messes up the otherwise nicely formatted CPU bootup numbers
in the kernel log.

Just remove the check. I checked the code and there are no valid code
paths where the thermal init code for a CPU could be called multiple
times.
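
For illustration only, the effect of the check can be modelled in user
space. The sketch below is not kernel code: every model_* name is
hypothetical, the vector number is an arbitrary illustrative value, and
the only assumption carried over from the kernel is that the vector
lives in the low 8 bits of the LVT entry. With the check enabled, a
stale vector left by the previous kernel makes init bail out before the
software pointer is set, so a later thermal interrupt lands in the
default "unexpected" stub.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_VECTOR_MASK    0xFFu  /* assumption: vector is the low 8 bits */
#define MODEL_THERMAL_VECTOR 0xFAu  /* illustrative vector number only */

typedef void (*model_handler_t)(void);

static int model_handled;     /* interrupts that reached the real handler */
static int model_unexpected;  /* interrupts that hit the default stub */

static void model_unexpected_interrupt(void) { model_unexpected++; }
static void model_thermal_interrupt(void)    { model_handled++; }

/* Software pointer to the thermal handler; starts at the stub. */
static model_handler_t model_vector = model_unexpected_interrupt;

/* Put the model back into its freshly booted state. */
static void model_reset(void)
{
	model_handled = 0;
	model_unexpected = 0;
	model_vector = model_unexpected_interrupt;
}

/*
 * lvt is the LVT thermal register value found at boot. with_check
 * selects the pre-patch behaviour. Returns true if init completed.
 */
static bool model_init_thermal(uint32_t lvt, bool with_check)
{
	/* Pre-patch: a vector already present means bail out early. */
	if (with_check && (lvt & MODEL_VECTOR_MASK))
		return false;

	/* Reached only when init is not cut short. */
	model_vector = model_thermal_interrupt;
	return true;
}

/* Deliver one thermal interrupt through the software pointer. */
static void model_raise_thermal_interrupt(void)
{
	assert(model_vector != NULL);
	model_vector();
}
```

Calling model_init_thermal(MODEL_THERMAL_VECTOR, true) and then raising
an interrupt increments only model_unexpected; the same call with
with_check set to false installs the handler, and the interrupt is
counted in model_handled.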

Why the kernel does not clean up this value on shutdown:

Thermal monitoring is controlled per logical CPU thread. Normal
shutdown code runs on just one CPU. To disable it we would need a
broadcast NMI to all CPUs on shutdown. That's overkill for this, so we
just ignore the stale value after kexec.

Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: linux-edac <linux-edac@xxxxxxxxxxxxxxx>
Cc: x86-ml <x86@xxxxxxxxxx>
Link: http://lkml.kernel.org/r/1444681922-8644-1-git-send-email-andi@xxxxxxxxxxxxxx
Signed-off-by: Borislav Petkov <bp@xxxxxxx>
---
arch/x86/kernel/cpu/mcheck/therm_throt.c | 8 --------
1 file changed, 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index 1af51b1586d7..2c5aaf8c2e2f 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -503,14 +503,6 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
return;
}

- /* Check whether a vector already exists */
- if (h & APIC_VECTOR_MASK) {
- printk(KERN_DEBUG
- "CPU%d: Thermal LVT vector (%#x) already installed\n",
- cpu, (h & APIC_VECTOR_MASK));
- return;
- }
-
/* early Pentium M models use different method for enabling TM2 */
if (cpu_has(c, X86_FEATURE_TM2)) {
if (c->x86 == 6 && (c->x86_model == 9 || c->x86_model == 13)) {
--
2.3.5
