[PATCH 4/4] doc: watchdog: Document buddy detector

From: Mayank Rungta via B4 Relay

Date: Thu Feb 12 2026 - 16:12:51 EST

From: Mayank Rungta <mrungta@xxxxxxxxxx>

The current documentation generalizes the hardlockup detector as primarily
NMI-perf-based and lacks details on the SMP "Buddy" detector.

Update the documentation to add a detailed description of the Buddy
detector, and also restructure the "Implementation" section to explicitly
separate "Softlockup Detector", "Hardlockup Detector (NMI/Perf)", and
"Hardlockup Detector (Buddy)".

Clarify that the softlockup hrtimer acts as the heartbeat generator for
both hardlockup mechanisms and centralize the configuration details in a
"Frequency and Heartbeats" section.

Signed-off-by: Mayank Rungta <mrungta@xxxxxxxxxx>
---
Documentation/admin-guide/lockup-watchdogs.rst | 149 +++++++++++++++++--------
1 file changed, 101 insertions(+), 48 deletions(-)
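
Note (not part of the commit): the detection-latency figures quoted in
the new text can be sanity-checked with a rough model. The Python sketch
below is an illustration only, not kernel code. The function names, the
millisecond granularity, the rule that the first NMI check merely records
a baseline, and the closed-form buddy bounds are all assumptions made for
this sketch, chosen to mirror the timelines and arithmetic in the
documentation.

```python
def hrtimer_period_ms(watchdog_thresh_s=10):
    """Heartbeat period, 2*watchdog_thresh/5, in milliseconds."""
    return 2 * watchdog_thresh_s * 1000 // 5


def heartbeats_seen(now_ms, first_heartbeat_ms, period_ms, lockup_ms):
    """Count heartbeats delivered up to now_ms; none fire at or after lockup_ms."""
    last_ms = min(now_ms, lockup_ms - 1)
    if last_ms < first_heartbeat_ms:
        return 0
    return (last_ms - first_heartbeat_ms) // period_ms + 1


def nmi_detection_delay_ms(first_check_ms, first_heartbeat_ms, lockup_ms,
                           watchdog_thresh_s=10):
    """Delay between the lockup and the periodic check that declares it.

    Hypothetical model: a check every watchdog_thresh seconds compares the
    heartbeat count with the value saved by the previous check. It assumes
    heartbeats are already running around the first check.
    """
    period_ms = hrtimer_period_ms(watchdog_thresh_s)
    check_ms = first_check_ms
    saved = None  # assumption: the first check only stores the state
    while True:
        count = heartbeats_seen(check_ms, first_heartbeat_ms,
                                period_ms, lockup_ms)
        if saved is not None and count == saved:
            return check_ms - lockup_ms
        saved = count
        check_ms += watchdog_thresh_s * 1000


def buddy_detection_bounds_ms(watchdog_thresh_s=10, missed_threshold=3):
    """(best, worst) buddy latency, following the arithmetic in the text.

    Best case: the first check lands right after the lockup, so only
    (missed_threshold - 1) further intervals elapse. Worst case: a full
    interval passes before every one of the missed_threshold checks.
    """
    period_ms = hrtimer_period_ms(watchdog_thresh_s)
    return (missed_threshold - 1) * period_ms, missed_threshold * period_ms
```

Feeding in the two NMI timelines from the documentation (times in
milliseconds), nmi_detection_delay_ms(100100, 100000, 103900) gives 6200
and nmi_detection_delay_ms(100000, 100100, 100200) gives 19800, in line
with the ~6 and ~20 second figures. buddy_detection_bounds_ms() gives
(8000, 12000), in line with the ~8s and ~12s figures.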

diff --git a/Documentation/admin-guide/lockup-watchdogs.rst b/Documentation/admin-guide/lockup-watchdogs.rst
index 1b374053771f676d874716b3210cade55ae89b28..7ae7ce3abd2c838ff29c70f7a32ffaf58531e150 100644
--- a/Documentation/admin-guide/lockup-watchdogs.rst
+++ b/Documentation/admin-guide/lockup-watchdogs.rst
@@ -30,22 +30,23 @@ timeout is set through the confusingly named "kernel.panic" sysctl),
to cause the system to reboot automatically after a specified amount
of time.

+Configuration
+=============
+
+The "watchdog_thresh" sysctl (default 10 seconds) lets administrators
+configure the lockup detection threshold; the hrtimer and NMI check
+periods are derived from it. The right value for a particular environment
+is a trade-off between fast response to lockups and detection overhead.
+
Implementation
==============

-The soft and hard lockup detectors are built on top of the hrtimer and
-perf subsystems, respectively. A direct consequence of this is that,
-in principle, they should work in any architecture where these
-subsystems are present.
+The soft lockup detector is built on top of the hrtimer subsystem.
+The hard lockup detector is built on top of the perf subsystem
+(on architectures that support it) or uses an SMP "buddy" system.

-A periodic hrtimer runs to generate interrupts and kick the watchdog
-job. An NMI perf event is generated every "watchdog_thresh"
-(compile-time initialized to 10 and configurable through sysctl of the
-same name) seconds to check for hardlockups. If any CPU in the system
-does not receive any hrtimer interrupt during that time the
-'hardlockup detector' (the handler for the NMI perf event) will
-generate a kernel warning or call panic, depending on the
-configuration.
+Softlockup Detector
+-------------------

The watchdog job runs in a stop scheduling thread that updates a
timestamp every time it is scheduled. If that timestamp is not updated
@@ -55,53 +56,105 @@ will dump useful debug information to the system log, after which it
will call panic if it was instructed to do so or resume execution of
other kernel code.

-The period of the hrtimer is 2*watchdog_thresh/5, which means it has
-two or three chances to generate an interrupt before the hardlockup
-detector kicks in.
+Frequency and Heartbeats
+------------------------
+
+The hrtimer used by the softlockup detector serves a dual purpose:
+it detects softlockups, and it also generates the interrupts
+(heartbeats) that the hardlockup detectors use to verify CPU liveness.
+
+The period of this hrtimer is 2*watchdog_thresh/5. This means the
+hrtimer has two or three chances to generate an interrupt before the
+NMI hardlockup detector kicks in.
+
+Hardlockup Detector (NMI/Perf)
+------------------------------
+
+On architectures that support NMI (Non-Maskable Interrupt) perf events,
+a periodic NMI is generated every "watchdog_thresh" seconds.
+
+If any CPU in the system does not receive any hrtimer interrupt
+(heartbeat) during the "watchdog_thresh" window, the 'hardlockup
+detector' (the handler for the NMI perf event) will generate a kernel
+warning or call panic.
+
+**Detection Overhead (NMI):**
+
+The time to detect a lockup varies with when the lockup occurs relative
+to the NMI check window. The examples below assume watchdog_thresh = 10.
+
+* **Best Case:** The lockup occurs just before the first heartbeat is
+ due. The detector notices the missing hrtimer interrupt at the very
+ next check.
+
+ ::
+
+ Time 100.0: cpu1 heartbeat
+ Time 100.1: hardlockup_check, cpu1 stores its state
+ Time 103.9: Hard Lockup on cpu1
+ Time 104.0: cpu1 heartbeat never comes
+ Time 110.1: hardlockup_check, cpu1 checks the state again, finds it unchanged, declares lockup
+
+ Time to detection: ~6 seconds
+
+* **Worst Case:** The lockup occurs shortly after a valid interrupt
+ (heartbeat) which itself happened just after the NMI check. The next
+ NMI check sees that the interrupt count has changed (due to that one
+ heartbeat), assumes the CPU is healthy, and resets the baseline. The
+ lockup is only detected at the subsequent check.
+
+ ::
+
+ Time 100.0: hardlockup_check, cpu1 stores its state
+ Time 100.1: cpu1 heartbeat
+ Time 100.2: Hard Lockup on cpu1
+ Time 110.0: hardlockup_check, cpu1 stores its state (misses lockup as state changed)
+ Time 120.0: hardlockup_check, cpu1 checks the state again, finds it unchanged, declares lockup

-As explained above, a kernel knob is provided that allows
-administrators to configure the period of the hrtimer and the perf
-event. The right value for a particular environment is a trade-off
-between fast response to lockups and detection overhead.
+ Time to detection: ~20 seconds

-Detection Overhead
-------------------
+Hardlockup Detector (Buddy)
+---------------------------

-The hardlockup detector checks for lockups using a periodic NMI perf
-event. This means the time to detect a lockup can vary depending on
-when the lockup occurs relative to the NMI check window.
+On architectures or configurations where NMI perf events are not
+available (or disabled), the kernel may use the "buddy" hardlockup
+detector. This mechanism requires SMP (Symmetric Multi-Processing).

-**Best Case:**
-In the best case scenario, the lockup occurs just before the first
-heartbeat is due. The detector will notice the missing hrtimer
-interrupt almost immediately during the next check.
+In this mode, each CPU is assigned a "buddy" CPU to monitor. The
+monitoring CPU runs its own hrtimer (the same one used for softlockup
+detection) and checks if the buddy CPU's hrtimer interrupt count has
+increased.

-::
+To ensure timeliness and avoid false positives, the buddy system performs
+checks at every hrtimer interval (2*watchdog_thresh/5, which is 4 seconds
+by default). It uses a missed-interrupt threshold of 3. If the buddy's
+interrupt count has not changed for 3 consecutive checks, it is assumed
+that the buddy CPU is hardlocked (interrupts disabled). The monitoring
+CPU will then trigger the hardlockup response (warning or panic).

- Time 100.0: cpu 1 heartbeat
- Time 100.1: hardlockup_check, cpu1 stores its state
- Time 103.9: Hard Lockup on cpu1
- Time 104.0: cpu 1 heartbeat never comes
- Time 110.1: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
+**Detection Overhead (Buddy):**

- Time to detection: ~6 seconds
+With a default check interval of 4 seconds (watchdog_thresh = 10):

-**Worst Case:**
-In the worst case scenario, the lockup occurs shortly after a valid
-interrupt (heartbeat) which itself happened just after the NMI check.
-The next NMI check sees that the interrupt count has changed (due to
-that one heartbeat), assumes the CPU is healthy, and resets the
-baseline. The lockup is only detected at the subsequent check.
+* **Best Case:** Lockup occurs just before a check.
+ Detected in ~8 seconds (0s till 1st check + 4s till 2nd + 4s till 3rd).
+* **Worst Case:** Lockup occurs just after a check.
+ Detected in ~12 seconds (4s till 1st check + 4s till 2nd + 4s till 3rd).

-::
+**Limitations of the Buddy Detector:**

- Time 100.0: hardlockup_check, cpu1 stores its state
- Time 100.1: cpu 1 heartbeat
- Time 100.2: Hard Lockup on cpu1
- Time 110.0: hardlockup_check, cpu1 stores its state (misses lockup as state changed)
- Time 120.0: hardlockup_check, cpu1 checks the state again, should be the same, declares lockup
+1. **All-CPU Lockup:** If all CPUs lock up simultaneously, the buddy
+ detector cannot detect the condition because the monitoring CPUs
+ are also frozen.
+2. **Stack Traces:** Unlike the NMI detector, the buddy detector
+ cannot directly interrupt the locked CPU to grab a stack trace.
+ It relies on architecture-specific mechanisms (like NMI backtrace
+ support) to try to retrieve the status of the locked CPU. If
+ such support is missing, the log may only show that a lockup
+ occurred without providing the locked CPU's stack.

- Time to detection: ~20 seconds
+Watchdog Core Exclusion
+=======================

By default, the watchdog runs on all online cores. However, on a
kernel configured with NO_HZ_FULL, by default the watchdog runs only

--
2.53.0.273.g2a3d683680-goog