Re: [v6 PATCH 0/2] hung_task: Provide runtime reset interface for hung task detector
From: Lance Yang
Date: Wed Jan 14 2026 - 22:24:22 EST
On 2026/1/15 10:32, Aaron Tomlin wrote:
> Hi Lance, Greg, Petr, Joel, Andrew,
> This series introduces the ability to reset
> /proc/sys/kernel/hung_task_detect_count.
> Writing a "0" value to this file atomically resets the counter of detected
> hung tasks. This functionality provides system administrators with the
> means to clear the cumulative diagnostic history following incident
> resolution, thereby simplifying subsequent monitoring without necessitating
> a system restart.
> The updated logic ensures that the long-running scan (which is inherently
> preemptible and subject to rcu_lock_break()) does not become desynchronised
> from the global state. By treating the initial read as a "version snapshot"
> the kernel can guarantee that the cumulative count only updates if the
> underlying state remained stable throughout the duration of the scan.
> Please let me know your thoughts.
There is a mismatch here with what Joel and Petr suggested ...
IIUC, we should just do:
- Patch 1: Full cmpxchg-based counting (Petr's POC), sysctl read-only
- Patch 2: Add write handler for userspace reset
That way Patch 1 is the real logic change, and Patch 2 is just adding
the userspace interface.
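To illustrate the split, here is a rough userspace sketch of the idea using
C11 atomics. It is not kernel code and not Petr's POC; every name in it
(detect_count, scan_begin(), scan_commit(), reset_count()) is made up for
illustration. scan_begin() and scan_commit() stand in for what Patch 1 would
do around a scan, and reset_count() stands in for what the Patch 2 write
handler would do.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cumulative hung task counter. */
static atomic_ulong detect_count;

/* Start of a scan: remember the counter value as the snapshot. */
static unsigned long scan_begin(void)
{
	return atomic_load(&detect_count);
}

/*
 * End of a scan: add the number of hung tasks found in this scan,
 * but only if the counter still holds the snapshot value. If it was
 * changed in the meantime (for example by a reset), the compare and
 * exchange fails and the result of this scan is dropped.
 *
 * Note: a reset to 0 while the snapshot was already 0 is not
 * detected by this comparison, so the update still goes through.
 */
static bool scan_commit(unsigned long snapshot, unsigned long found)
{
	if (!found)
		return true;	/* nothing to publish */

	return atomic_compare_exchange_strong(&detect_count, &snapshot,
					      snapshot + found);
}

/* Reset path: what a write of "0" from userspace would trigger. */
static void reset_count(void)
{
	atomic_store(&detect_count, 0);
}
```

With only scan_begin() and scan_commit() in place, the counter behaves as a
read-only value from the userspace point of view, so adding reset_count()
later does not require touching the counting logic again.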
Thanks,
Lance