[PATCH 0/4] watchdog/hardlockup: Improvements to hardlockup detection and documentation
From: Mayank Rungta via B4 Relay
Date: Thu Feb 12 2026 - 16:12:36 EST
This series addresses limitations in the hardlockup detector implementations
and updates the documentation to reflect actual behavior and recent changes.
The changes are structured as follows:

Hardlockup Detection Improvements (Patches 1 & 3)
=================================================

The hardlockup detector logic relies on updating saved interrupt counts to
determine if the CPU is making progress.
Patch 1 ensures that the saved interrupt count is updated unconditionally
before checking the "touched" flag. This prevents stale comparisons which
can delay detection. This is a logic fix that ensures the detector remains
accurate even when the watchdog is frequently touched.
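
For illustration only (this is not code from the series): a minimal
user-space sketch of the ordering change described for Patch 1. The struct,
its field names, and the functions check_touched_first() and
check_update_first() are hypothetical stand-ins, and the real per-CPU state
in kernel/watchdog.c may differ. The sketch assumes the check compares a
running hrtimer interrupt count against a saved copy and treats "no change"
as a stall.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; names are illustrative, not the kernel's. */
struct cpu_state {
	unsigned long hrtimer_interrupts;       /* bumped by the hrtimer */
	unsigned long hrtimer_interrupts_saved; /* value seen at the last check */
	bool touched;                           /* set when the watchdog is touched */
};

/* Compare against the saved count and always refresh it. */
static bool stalled_and_refresh(struct cpu_state *s)
{
	bool stalled = (s->hrtimer_interrupts_saved == s->hrtimer_interrupts);

	s->hrtimer_interrupts_saved = s->hrtimer_interrupts;
	return stalled;
}

/* Ordering before the fix (as described above): a touched check returns
 * early, so the saved count is left stale. */
static bool check_touched_first(struct cpu_state *s)
{
	if (s->touched) {
		s->touched = false;
		return false;
	}
	return stalled_and_refresh(s);
}

/* Ordering after the fix: refresh the saved count first, then honour the
 * touched flag. */
static bool check_update_first(struct cpu_state *s)
{
	bool stalled = stalled_and_refresh(s);

	if (s->touched) {
		s->touched = false;
		return false;
	}
	return stalled;
}
```

In this model, if a CPU is touched, takes one more check, and then locks up,
check_touched_first() needs two further checks to report the stall because
its first comparison is against a stale count. check_update_first() reports
it on the very next check.
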
Patch 3 improves the Buddy detector's timeliness. The current checking
interval (every 3rd sample) causes high variability in detection time (up
to 24s). This patch changes the Buddy detector to check at every hrtimer
interval (4s) with a missed-interrupt threshold of 3, narrowing the
detection window to a consistent 8-12 second range.
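
The figures above can be reproduced with a few lines of arithmetic. This is
a sketch under stated assumptions, not the patch's counter logic: it assumes
the default watchdog_thresh of 10 seconds, the usual derivation of the
hrtimer period as 2 * watchdog_thresh / 5, and it reads the quoted 8-12
second window as "(threshold - 1) to threshold hrtimer periods". All
function names are hypothetical.

```c
/* All values are in seconds. */

/* hrtimer sample period, assuming period = 2 * watchdog_thresh / 5. */
static int hrtimer_period_s(int watchdog_thresh)
{
	return 2 * watchdog_thresh / 5;
}

/* Old scheme: the buddy is checked on every 3rd sample. A lockup that
 * starts right after a check that saw progress is only reported two
 * checks later, because the next check still sees a changed count. */
static int old_worst_case_s(int period)
{
	int check_interval = 3 * period;

	return 2 * check_interval;
}

/* New scheme, read as a window of (threshold - 1) to threshold periods.
 * This only restates the range quoted in the cover letter. */
static int new_window_low_s(int period, int threshold)
{
	return (threshold - 1) * period;
}

static int new_window_high_s(int period, int threshold)
{
	return threshold * period;
}
```

With watchdog_thresh = 10 and a threshold of 3 this gives a 4 second period,
a 24 second worst case for the old scheme, and an 8 to 12 second window for
the new one.
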

Documentation Updates (Patches 2 & 4)
=====================================

The current documentation does not fully capture the variable nature of
detection latency or the details of the Buddy system.
Patch 2 removes the strict "10 seconds" definition of a hardlockup, which
was misleading given the periodic nature of the detector. It adds a
"Detection Overhead" section to the admin guide, using "Best Case" and
"Worst Case" scenarios to illustrate that detection time can vary
significantly (e.g., ~6s to ~20s).
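
One simple model that reproduces the ~6s to ~20s range is sketched below.
It is an assumption-laden illustration, not text or code from the patch: it
assumes a periodic check every 10 seconds (the default watchdog_thresh), an
hrtimer tick every 4 seconds that stops when the CPU locks up, and a check
that reports a lockup when the tick count has not changed since the previous
check. The touched flag is ignored, and all names are hypothetical.

```c
#include <limits.h>

enum {
	HRTIMER_PERIOD_MS = 4000,  /* hrtimer interval quoted in this series */
	CHECK_PERIOD_MS   = 10000, /* assumed: one check per watchdog_thresh */
	STEP_MS           = 100,   /* sweep granularity */
};

/* Ticks fire at hr_phase_ms + k * HRTIMER_PERIOD_MS and stop at lockup_ms.
 * Return how many fired strictly before t_ms. */
static long ticks_before(long t_ms, long lockup_ms, long hr_phase_ms)
{
	long limit = t_ms < lockup_ms ? t_ms : lockup_ms;

	if (limit <= hr_phase_ms)
		return 0;
	return (limit - hr_phase_ms - 1) / HRTIMER_PERIOD_MS + 1;
}

/* Time from the start of the lockup to the first check that sees an
 * unchanged tick count. */
static long detection_latency_ms(long lockup_ms, long hr_phase_ms,
				 long check_phase_ms)
{
	long saved = -1;

	for (long t = check_phase_ms; ; t += CHECK_PERIOD_MS) {
		long count = ticks_before(t, lockup_ms, hr_phase_ms);

		if (count == saved)
			return t - lockup_ms;
		saved = count;
	}
}

/* Sweep timer phases and lockup start times; record the extremes. */
static void sweep_latency(long *min_ms, long *max_ms)
{
	*min_ms = LONG_MAX;
	*max_ms = 0;
	for (long hr = 0; hr < HRTIMER_PERIOD_MS; hr += STEP_MS)
		for (long chk = 0; chk < CHECK_PERIOD_MS; chk += STEP_MS)
			for (long lk = 30000; lk < 50000; lk += STEP_MS) {
				long d = detection_latency_ms(lk, hr, chk);

				if (d < *min_ms)
					*min_ms = d;
				if (d > *max_ms)
					*max_ms = d;
			}
}
```

Calling sweep_latency() yields a minimum close to 6 seconds (the lockup
starts just before a tick that would have fired, shortly after a check saved
the count) and a maximum close to 20 seconds (a tick lands just after a
check and the lockup starts right after that tick).
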
Patch 4 adds a dedicated section for the Buddy detector, which was previously
undocumented. It details the mechanism, the new timing logic, and known
limitations.
Signed-off-by: Mayank Rungta <mrungta@xxxxxxxxxx>
---
Mayank Rungta (4):
      watchdog/hardlockup: Always update saved interrupts during check
      doc: watchdog: Clarify hardlockup detection timing
      watchdog/hardlockup: improve buddy system detection timeliness
      doc: watchdog: Document buddy detector

 Documentation/admin-guide/lockup-watchdogs.rst | 132 +++++++++++++++++++++----
 include/linux/nmi.h                            |   1 +
 kernel/watchdog.c                              |  41 ++++++--
 kernel/watchdog_buddy.c                        |   9 +-
 4 files changed, 146 insertions(+), 37 deletions(-)
---
base-commit: 0dddf20b4fd4afd59767acc144ad4da60259f21f
change-id: 20260211-hardlockup-watchdog-fixes-60317598ac20
Best regards,
--
Mayank Rungta <mrungta@xxxxxxxxxx>