Re: [PATCH] x86/thermal: Fix LVT thermal setup for SMI delivery mode

From: Srinivas Pandruvada
Date: Thu May 27 2021 - 14:10:09 EST


On Thu, 2021-05-27 at 12:31 +0200, Borislav Petkov wrote:
> Ok,
>
> it took me a while to find a box like yours to reproduce on. Anyway,
> here's what looks like the final fix, you could give it a run.
>
> Thx.
>
> ---
> From: Borislav Petkov <bp@xxxxxxx>
> Date: Thu, 27 May 2021 11:02:26 +0200
>
> There are machines out there with added value crap^WBIOS which provide an
> SMI handler for the local APIC thermal sensor interrupt. Out of reset,
> the BSP on those machines has something like 0x200 in that APIC register
> (timestamps left in because this whole issue is timing sensitive):
>
> [ 0.033858] read lvtthmr: 0x330, val: 0x200
>
> which means:
>
> - bit 16 - the interrupt mask bit is clear and thus that interrupt is enabled
> - bits [10:8] have 010b which means SMI delivery mode.
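
For anyone following along, the decoding above can be sketched in plain
C. This is only an illustration outside the kernel: the SKETCH_* macros
and helper names are made up for this example and are not the kernel's
definitions. It assumes the layout described above, mask in bit 16 and
delivery mode in bits [10:8], with 010b meaning SMI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch, not the kernel's macros. */
#define SKETCH_LVT_MASKED   (1u << 16)  /* bit 16: interrupt mask */
#define SKETCH_DM_SHIFT     8           /* delivery mode lives in bits [10:8] */
#define SKETCH_DM_FIELD     0x7u
#define SKETCH_DM_SMI       0x2u        /* 010b: SMI delivery mode */

/* True when the mask bit is set, i.e. the interrupt is disabled. */
static inline bool sketch_lvt_masked(uint32_t lvt)
{
	return (lvt & SKETCH_LVT_MASKED) != 0;
}

/* Extract the 3-bit delivery mode field. */
static inline uint32_t sketch_lvt_delivery_mode(uint32_t lvt)
{
	return (lvt >> SKETCH_DM_SHIFT) & SKETCH_DM_FIELD;
}
```

With the 0x200 from the log, sketch_lvt_masked() is false and
sketch_lvt_delivery_mode() equals SKETCH_DM_SMI, which matches the two
bullets above.
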
>
> Now, later during boot, when the kernel programs the local APIC, it
> soft-disables it temporarily through the spurious vector register:
>
> setup_local_APIC:
>
> ...
>
> /*
> * If this comes from kexec/kcrash the APIC might be enabled in
> * SPIV. Soft disable it before doing further initialization.
> */
> value = apic_read(APIC_SPIV);
> value &= ~APIC_SPIV_APIC_ENABLED;
> apic_write(APIC_SPIV, value);
>
> which means (from the SDM):
>
> "10.4.7.2 Local APIC State After It Has Been Software Disabled
>
> ...
>
> * The mask bits for all the LVT entries are set. Attempts to reset these
> bits will be ignored."
>
> And this happens too:
>
> [ 0.124111] APIC: Switch to symmetric I/O mode setup
> [ 0.124117] lvtthmr 0x200 before write 0xf to APIC 0xf0
> [ 0.124118] lvtthmr 0x10200 after write 0xf to APIC 0xf0
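
The 0x200 -> 0x10200 transition in this log is just the mask bit being
forced on. A toy model of it, assuming only the SDM behavior quoted
above (the function and macro names are hypothetical, nothing here
touches a real APIC):

```c
#include <stdint.h>

/* Hypothetical name for this sketch: bit 16 is the LVT mask bit. */
#define SKETCH_LVT_MASKED (1u << 16)

/*
 * Toy model of what an LVT entry reads back after the APIC has been
 * software-disabled: the mask bit is set, the other bits are kept.
 */
static inline uint32_t sketch_lvt_after_soft_disable(uint32_t lvt)
{
	return lvt | SKETCH_LVT_MASKED;
}
```

So a snapshot of the register taken before the soft-disable holds
0x200, while one taken afterwards holds 0x10200 and would carry the
masked state along if it were later replicated.
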
>
> This results in CPU 0 soft lockups depending on the placement in time
> when the APIC soft-disable happens. Those soft lockups are not 100%
> reproducible and the reason for that can only be speculated as no one
> tells you what SMM does. Likely, it confuses the SMM code that the APIC
> is disabled and the thermal interrupt doesn't fire at all,

My guess is that the system sometimes boots hot. SMM then starts a fan or
some other cooling and sets a temperature threshold. It waits for the
thermal interrupt for that threshold, which it never gets.


Thanks,
Srinivas


> leading to CPU 0 stuck in SMM forever...
>
> Now, before
>
> 4f432e8bb15b ("x86/mce: Get rid of mcheck_intel_therm_init()")
>
> due to how the APIC_LVTTHMR was read before APIC initialization in
> mcheck_intel_therm_init(), it would read the value with the mask bit 16
> clear and then intel_init_thermal() would replicate it onto the APs and
> all would be peachy - the thermal interrupt would remain enabled.
>
> But that commit moved that reading to a later moment in
> intel_init_thermal(), resulting in reading APIC_LVTTHMR on the BSP too
> late and with its interrupt mask bit set.
>
> Thus, revert back to the old behavior of reading the thermal LVT
> register before the APIC gets initialized.
>
> Fixes: 4f432e8bb15b ("x86/mce: Get rid of mcheck_intel_therm_init()")
> Reported-by: James Feeney <james@xxxxxxxxxxx>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Cc: Zhang Rui <rui.zhang@xxxxxxxxx>
> Cc: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
> Link: https://lkml.kernel.org/r/YKIqDdFNaXYd39wz@xxxxxxx
> ---
> arch/x86/include/asm/thermal.h | 4 +++-
> arch/x86/kernel/setup.c | 9 +++++++++
> drivers/thermal/intel/therm_throt.c | 15 +++++++++++----
> 3 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/thermal.h b/arch/x86/include/asm/thermal.h
> index ddbdefd5b94f..91a7b6687c3b 100644
> --- a/arch/x86/include/asm/thermal.h
> +++ b/arch/x86/include/asm/thermal.h
> @@ -3,11 +3,13 @@
> #define _ASM_X86_THERMAL_H
>
> #ifdef CONFIG_X86_THERMAL_VECTOR
> +void therm_lvt_init(void);
> void intel_init_thermal(struct cpuinfo_x86 *c);
> bool x86_thermal_enabled(void);
> void intel_thermal_interrupt(void);
> #else
> -static inline void intel_init_thermal(struct cpuinfo_x86 *c) { }
> +static inline void therm_lvt_init(void) { }
> +static inline void intel_init_thermal(struct cpuinfo_x86 *c) { }
> #endif
>
> #endif /* _ASM_X86_THERMAL_H */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 72920af0b3c0..ff653d608d5f 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -44,6 +44,7 @@
> #include <asm/pci-direct.h>
> #include <asm/prom.h>
> #include <asm/proto.h>
> +#include <asm/thermal.h>
> #include <asm/unwind.h>
> #include <asm/vsyscall.h>
> #include <linux/vmalloc.h>
> @@ -1226,6 +1227,14 @@ void __init setup_arch(char **cmdline_p)
>
> x86_init.timers.wallclock_init();
>
> + /*
> + * This needs to run before setup_local_APIC() which soft-disables the
> + * local APIC temporarily and that masks the thermal LVT interrupt,
> + * leading to softlockups on machines which have configured SMI
> + * interrupt delivery.
> + */
> + therm_lvt_init();
> +
> mcheck_init();
>
> register_refined_jiffies(CLOCK_TICK_RATE);
> diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
> index f8e882592ba5..99abdc03c44c 100644
> --- a/drivers/thermal/intel/therm_throt.c
> +++ b/drivers/thermal/intel/therm_throt.c
> @@ -621,6 +621,17 @@ bool x86_thermal_enabled(void)
> return atomic_read(&therm_throt_en);
> }
>
> +void __init therm_lvt_init(void)
> +{
> + /*
> + * This function is only called on boot CPU. Save the init thermal
> + * LVT value on BSP and use that value to restore APs' thermal LVT
> + * entry BIOS programmed later
> + */
> + if (intel_thermal_supported(&boot_cpu_data))
> + lvtthmr_init = apic_read(APIC_LVTTHMR);
> +}
> +
> void intel_init_thermal(struct cpuinfo_x86 *c)
> {
> unsigned int cpu = smp_processor_id();
> @@ -630,10 +641,6 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
> if (!intel_thermal_supported(c))
> return;
>
> - /* On the BSP? */
> - if (c == &boot_cpu_data)
> - lvtthmr_init = apic_read(APIC_LVTTHMR);
> -
> /*
> * First check if its enabled already, in which case there might
> * be some SMM goo which handles it, so we can't even put a handler
> --
> 2.29.2
>
>