[RFC PATCH v2 12/14] x86/watchdog/hardlockup/hpet: Determine if HPET timer caused NMI

From: Ricardo Neri
Date: Wed Feb 27 2019 - 11:06:08 EST


The only direct method to determine whether an HPET timer caused an
interrupt is to read the Interrupt Status register. Unfortunately,
reading HPET registers is slow and, therefore, it is not recommended to
read them while in NMI context. Furthermore, status is not available if
the interrupt is generated via the Front Side Bus.

An indirect method is to compute the expected value of the time-stamp
counter at the time of the interrupt and then verify that its actual
value is within a range of the expected value. Since the hardlockup
detector operates in seconds, high precision is not needed. This
implementation considers that the HPET caused the NMI if the time-stamp
counter reads the expected value -/+ 1.5%. This value is selected as it
is approximately equivalent to 1/64 (about 1.56%) and the division can
be performed using a bit shift. Experimentally, the error in the
estimation is consistently less than 1%.
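
For illustration only, the tolerance check described above can be sketched
in standalone user-space C. The names struct tsc_window, tsc_window_arm()
and tsc_window_hit() are hypothetical and are not part of this patch; the
sketch also computes the absolute difference with an explicit comparison
instead of the kernel's abs() helper, and assumes the TSC frequency is
given in kHz as tsc_khz is in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the expected TSC value and its tolerance. */
struct tsc_window {
	uint64_t next;       /* expected TSC value when the timer expires */
	uint64_t next_error; /* tolerated deviation: delta / 64 (~1.56%) */
};

/* Compute the window when the timer is (re)armed. */
static struct tsc_window tsc_window_arm(uint64_t tsc_now, uint64_t thresh_sec,
					uint64_t tsc_khz)
{
	/* TSC ticks expected to elapse in thresh_sec seconds. */
	uint64_t delta = thresh_sec * tsc_khz * 1000ULL;
	struct tsc_window w = {
		.next = tsc_now + delta,
		.next_error = delta >> 6, /* divide by 64 with a shift */
	};

	return w;
}

/* Return true if tsc_now is strictly within the tolerance of w->next. */
static bool tsc_window_hit(const struct tsc_window *w, uint64_t tsc_now)
{
	/* Unsigned-safe absolute difference. */
	uint64_t diff = tsc_now > w->next ? tsc_now - w->next
					  : w->next - tsc_now;

	return diff < w->next_error;
}
```

With a 2 GHz TSC and a 10 second threshold, for example, the delta is
2 * 10^10 ticks and the tolerance is 312500000 ticks on either side of the
expected value.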

Also, only read the time-stamp counter of the handling CPU (the one
targeted by the HPET timer). This helps to avoid variability of the time
stamp across CPUs.

Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Ashok Raj <ashok.raj@xxxxxxxxx>
Cc: Andi Kleen <andi.kleen@xxxxxxxxx>
Cc: Tony Luck <tony.luck@xxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Clemens Ladisch <clemens@xxxxxxxxxx>
Cc: Arnd Bergmann <arnd@xxxxxxxx>
Cc: Philippe Ombredanne <pombredanne@xxxxxxxx>
Cc: Kate Stewart <kstewart@xxxxxxxxxxxxxxxxxxx>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@xxxxxxxxx>
Cc: Mimi Zohar <zohar@xxxxxxxxxxxxx>
Cc: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
Cc: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Cc: Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx>
Cc: Nayna Jain <nayna@xxxxxxxxxxxxx>
Cc: "Ravi V. Shankar" <ravi.v.shankar@xxxxxxxxx>
Cc: x86@xxxxxxxxxx
Suggested-by: Andi Kleen <andi.kleen@xxxxxxxxx>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
---
arch/x86/include/asm/hpet.h | 2 ++
arch/x86/kernel/watchdog_hld_hpet.c | 28 +++++++++++++++++++++++++---
2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 15dc3b576496..09763340c911 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -123,6 +123,8 @@ struct hpet_hld_data {
u32 num;
u32 flags;
u64 ticks_per_second;
+ u64 tsc_next;
+ u64 tsc_next_error;
u32 handling_cpu;
struct cpumask cpu_monitored_mask;
struct msi_msg msi_msg;
diff --git a/arch/x86/kernel/watchdog_hld_hpet.c b/arch/x86/kernel/watchdog_hld_hpet.c
index cfa284da4bf6..65b4699f249a 100644
--- a/arch/x86/kernel/watchdog_hld_hpet.c
+++ b/arch/x86/kernel/watchdog_hld_hpet.c
@@ -55,6 +55,11 @@ static inline void set_comparator(struct hpet_hld_data *hdata,
*
* Reprogram the timer to expire within watchdog_thresh seconds in the future.
*
+ * Also compute the expected value of the time-stamp counter at the time of
+ * expiration as well as a deviation from the expected value. The maximum
+ * deviation is ~1.5%. It is computed cheaply by shifting the delta between
+ * the current and expected time-stamp values right by 6 positions.
+ *
* Returns:
*
* None
@@ -62,7 +67,18 @@ static inline void set_comparator(struct hpet_hld_data *hdata,
static void kick_timer(struct hpet_hld_data *hdata, bool force)
{
bool kick_needed = force || !(hdata->flags & HPET_DEV_PERI_CAP);
- unsigned long new_compare, count;
+ unsigned long tsc_curr, tsc_delta, new_compare, count;
+
+ /* Start by obtaining the current TSC and HPET counts. */
+ tsc_curr = rdtsc();
+
+ if (kick_needed)
+ count = get_count();
+
+ tsc_delta = (unsigned long)watchdog_thresh * (unsigned long)tsc_khz
+ * 1000L;
+ hdata->tsc_next = tsc_curr + tsc_delta;
+ hdata->tsc_next_error = tsc_delta >> 6;

/*
* Update the comparator in increments of watch_thresh seconds relative
@@ -74,8 +90,6 @@ static void kick_timer(struct hpet_hld_data *hdata, bool force)
*/

if (kick_needed) {
- count = get_count();
-
new_compare = count + watchdog_thresh * hdata->ticks_per_second;

set_comparator(hdata, new_compare);
@@ -147,6 +161,14 @@ static void set_periodic(struct hpet_hld_data *hdata)
*/
static bool is_hpet_wdt_interrupt(struct hpet_hld_data *hdata)
{
+ if (smp_processor_id() == hdata->handling_cpu) {
+ unsigned long tsc_curr;
+
+ tsc_curr = rdtsc();
+ if (abs(tsc_curr - hdata->tsc_next) < hdata->tsc_next_error)
+ return true;
+ }
+
return false;
}

--
2.17.1