[PATCH v7 24/24] x86/tsc: Stop the HPET hardlockup detector if TSC becomes unstable

From: Ricardo Neri
Date: Wed Mar 01 2023 - 18:39:18 EST


The HPET-based hardlockup detector relies on the TSC to determine whether
an observed NMI originated from the HPET timer. Hence, this detector can
no longer be used once the TSC becomes unstable. Once marked as unstable,
the TSC is never considered stable again. In that case, permanently stop
the HPET-based hardlockup detector.
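
As an illustration of the intended behavior only, the one-way transition
can be modeled in user space roughly as follows. This is a sketch, not
kernel code: struct hld_model, the model_*() helpers and the two flags are
hypothetical names made up for this example, and the assumption that a
later start request is refused once the detector has been marked
unavailable is a reading of the description above, not something this
patch shows.

```c
/* User-space model only. All names here are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

enum detector_type { DETECTOR_PERF, DETECTOR_HPET };

struct hld_model {
	enum detector_type type;
	bool tsc_stable;	/* one-way: never set back to true */
	bool hpet_available;	/* one-way: never set back to true */
};

/* Models the intent of hardlockup_detector_mark_hpet_hld_unavailable(). */
static void model_mark_hpet_unavailable(struct hld_model *m)
{
	/* Nothing to do unless the HPET-based detector is the one in use. */
	if (m->type != DETECTOR_HPET)
		return;

	fprintf(stderr, "watchdog: TSC is unstable. Stopping the HPET NMI watchdog.\n");
	m->hpet_available = false;
}

/* Models the hook added at the end of mark_tsc_unstable(). */
static void model_mark_tsc_unstable(struct hld_model *m)
{
	assert(m != NULL);
	m->tsc_stable = false;
	model_mark_hpet_unavailable(m);
}

/* Assumption: a start request succeeds only while still available. */
static bool model_start(const struct hld_model *m)
{
	return m->type == DETECTOR_HPET && m->hpet_available;
}
```

In this model, calling model_mark_tsc_unstable() on an HPET-type instance
makes every later model_start() return false, while a perf-type instance
is left untouched.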

Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Cc: Stephane Eranian <eranian@xxxxxxxxxx>
Cc: "Ravi V. Shankar" <ravi.v.shankar@xxxxxxxxx>
Cc: iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx
Cc: linuxppc-dev@xxxxxxxxxxxxxxxx
Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
---
Changes since v6:
* Do not switch to the perf-based NMI watchdog. Instead, only stop
the HPET-based NMI watchdog if the TSC becomes unstable.

Changes since v5:
* Relocated the declaration of hardlockup_detector_switch_to_perf() to
x86/nmi.h. It does not depend on HPET.
* Removed function stub. The shim hardlockup detector is always for x86.

Changes since v4:
* Added a stub version of hardlockup_detector_switch_to_perf() for
!CONFIG_HPET_TIMER. (lkp)
* Reconfigure the whole lockup detector instead of unconditionally
starting the perf-based hardlockup detector.

Changes since v3:
* None

Changes since v2:
* Introduced this patch.

Changes since v1:
* N/A
---
arch/x86/include/asm/nmi.h | 6 ++++++
arch/x86/kernel/tsc.c | 3 +++
arch/x86/kernel/watchdog_hld.c | 11 +++++++++++
3 files changed, 20 insertions(+)

diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index 5c5f1e56c404..4d0687a2b4ea 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -63,4 +63,10 @@ void stop_nmi(void);
void restart_nmi(void);
void local_touch_nmi(void);

+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+extern void hardlockup_detector_mark_hpet_hld_unavailable(void);
+#else
+static inline void hardlockup_detector_mark_hpet_hld_unavailable(void) {}
+#endif
+
#endif /* _ASM_X86_NMI_H */
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 344698852146..24f77efea569 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1191,6 +1191,9 @@ void mark_tsc_unstable(char *reason)

clocksource_mark_unstable(&clocksource_tsc_early);
clocksource_mark_unstable(&clocksource_tsc);
+
+ /* The HPET hardlockup detector depends on a stable TSC. */
+ hardlockup_detector_mark_hpet_hld_unavailable();
}

EXPORT_SYMBOL_GPL(mark_tsc_unstable);
diff --git a/arch/x86/kernel/watchdog_hld.c b/arch/x86/kernel/watchdog_hld.c
index 33c22f6456a3..f5d79ce0e7a2 100644
--- a/arch/x86/kernel/watchdog_hld.c
+++ b/arch/x86/kernel/watchdog_hld.c
@@ -6,6 +6,8 @@
* Copyright (C) Intel Corporation 2023
*/

+#define pr_fmt(fmt) "watchdog: " fmt
+
#include <linux/nmi.h>
#include <asm/hpet.h>

@@ -84,3 +86,12 @@ void watchdog_nmi_start(void)
if (detector_type == X86_HARDLOCKUP_DETECTOR_HPET)
hardlockup_detector_hpet_start();
}
+
+void hardlockup_detector_mark_hpet_hld_unavailable(void)
+{
+ if (detector_type != X86_HARDLOCKUP_DETECTOR_HPET)
+ return;
+
+ pr_warn("TSC is unstable. Stopping the HPET NMI watchdog.\n");
+ hardlockup_detector_mark_unavailable();
+}
--
2.25.1