Re: [PATCH v2 4/5] watchdog/hardlockup: improve buddy system detection timeliness
From: Petr Mladek
Date: Mon Mar 23 2026 - 13:49:06 EST
On Thu 2026-03-12 16:22:05, Mayank Rungta via B4 Relay wrote:
> From: Mayank Rungta <mrungta@xxxxxxxxxx>
>
> Currently, the buddy system only performs checks every 3rd sample. With
> a 4-second interval, a missed check window means the next check occurs
> 12 seconds later, potentially delaying hard lockup detection for up to
> 24 seconds.
>
> Modify the buddy system to perform checks at every interval (4s).
> Introduce a missed-interrupt threshold to maintain the existing grace
> period while reducing the detection window to 8-12 seconds.
>
> Best and worst case detection scenarios:
>
> Before (12s check window):
> - Best case: Lockup occurs after first check but just before heartbeat
> interval. Detected in ~8s (8s till next check).
> - Worst case: Lockup occurs just after a check.
> Detected in ~24s (missed check + 12s till next check + 12s logic).
>
> After (4s check window with threshold of 3):
> - Best case: Lockup occurs just before a check.
> Detected in ~8s (0s till 1st check + 4s till 2nd + 4s till 3rd).
> - Worst case: Lockup occurs just after a check.
> Detected in ~12s (4s till 1st check + 4s till 2nd + 4s till 3rd).
>
> Reviewed-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> Signed-off-by: Mayank Rungta <mrungta@xxxxxxxxxx>
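
For reference, the "After" arithmetic quoted above can be reproduced with a
small standalone model. This is only a sketch, not the code from the patch:
detection_latency_s(), SAMPLE_PERIOD_S and MISS_THRESHOLD are hypothetical
names, and the model assumes (as the commit message's numbers do) that the
first check after the lockup already counts as a miss.

```c
/* Hypothetical constants, taken from the numbers in the commit message. */
#define SAMPLE_PERIOD_S 4	/* seconds between buddy checks */
#define MISS_THRESHOLD  3	/* consecutive missed checks before reporting */

/*
 * Seconds from lockup to detection.
 *
 * first_check_delay_s: time from the lockup to the first check that
 * observes no heartbeat progress (0 <= delay <= SAMPLE_PERIOD_S).
 *
 * Assumption: every check after the lockup counts as one miss, and a
 * lockup is reported once MISS_THRESHOLD misses have accumulated.
 */
static int detection_latency_s(int first_check_delay_s)
{
	int missed = 0;
	int elapsed = first_check_delay_s;

	for (;;) {
		missed++;			/* this check saw no progress */
		if (missed >= MISS_THRESHOLD)
			return elapsed;
		elapsed += SAMPLE_PERIOD_S;	/* wait for the next check */
	}
}
```

With these assumptions, detection_latency_s(0) gives 8 (lockup just before a
check) and detection_latency_s(SAMPLE_PERIOD_S) gives 12 (lockup just after a
check), matching the 8-12 second window in the commit message.
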
LGTM:
Reviewed-by: Petr Mladek <pmladek@xxxxxxxx>
Best Regards,
Petr