Re: [v4 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count
From: Lance Yang
Date: Sun Dec 21 2025 - 21:21:46 EST
On 2025/12/22 09:42, Aaron Tomlin wrote:
Introduce support for writing to /proc/sys/kernel/hung_task_detect_count.
Writing a value of zero to this file atomically resets the counter of
detected hung tasks. This grants system administrators the ability to
clear the cumulative diagnostic history after resolving an incident,
simplifying monitoring without requiring a system restart.
Signed-off-by: Aaron Tomlin <atomlin@xxxxxxxxxxx>
---
Documentation/admin-guide/sysctl/kernel.rst | 3 +-
kernel/hung_task.c | 76 ++++++++++++++++++---
2 files changed, 67 insertions(+), 12 deletions(-)
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 239da22c4e28..68da4235225a 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -418,7 +418,8 @@ hung_task_detect_count
======================
Indicates the total number of tasks that have been detected as hung since
-the system boot.
+the system boot or since the counter was reset. The counter is zeroed when
+a value of 0 is written.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 00c3296fd692..70b3db047f5d 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -17,6 +17,7 @@
#include <linux/export.h>
#include <linux/panic_notifier.h>
#include <linux/sysctl.h>
+#include <linux/atomic.h>
#include <linux/suspend.h>
#include <linux/utsname.h>
#include <linux/sched/signal.h>
@@ -36,7 +37,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
/*
* Total number of tasks detected as hung since boot:
*/
-static unsigned long __read_mostly sysctl_hung_task_detect_count;
+static atomic_long_t sysctl_hung_task_detect_count = ATOMIC_LONG_INIT(0);
/*
* Limit number of tasks checked in a batch.
@@ -246,20 +247,26 @@ static inline void hung_task_diagnostics(struct task_struct *t)
}
static void check_hung_task(struct task_struct *t, unsigned long timeout,
- unsigned long prev_detect_count)
+ unsigned long prev_detect_count)
{
- unsigned long total_hung_task;
+ unsigned long total_hung_task, current_detect;
if (!task_is_hung(t, timeout))
return;
/*
* This counter tracks the total number of tasks detected as hung
- * since boot.
+ * since boot. If a reset occurred during the scan, we treat the
+ * current count as the new delta to avoid an underflow error.
+ * Ensure hang details are globally visible before the counter
+ * update.
*/
- sysctl_hung_task_detect_count++;
+ current_detect = atomic_long_inc_return_acquire(&sysctl_hung_task_detect_count);
+ if (current_detect >= prev_detect_count)
+ total_hung_task = current_detect - prev_detect_count;
+ else
+ total_hung_task = current_detect;
- total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
trace_sched_process_hang(t);
if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
@@ -318,7 +325,8 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
int max_count = sysctl_hung_task_check_count;
unsigned long last_break = jiffies;
struct task_struct *g, *t;
- unsigned long prev_detect_count = sysctl_hung_task_detect_count;
+ /* Acquire prevents reordering task checks before this point. */
+ unsigned long prev_detect_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count);
int need_warning = sysctl_hung_task_warnings;
unsigned long si_mask = hung_task_si_mask;
@@ -346,7 +354,9 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
unlock:
rcu_read_unlock();
- if (!(sysctl_hung_task_detect_count - prev_detect_count))
+ /* Ensures we see all hang details recorded during the scan. */
+ if (!(atomic_long_read_acquire(&sysctl_hung_task_detect_count) -
+ prev_detect_count))
return;
Hmm, I think we're missing the same underflow check here ...
If a reset happens mid-scan, this subtraction can also underflow
and cause false positives in the diagnostics :)
We should apply the same "if (current < prev) use current" logic
here, as Petr mentioned before.
[...]
Cheers,
Lance