Re: [v7 PATCH 1/2] hung_task: Refactor detection logic and atomicise detection count

From: Lance Yang

Date: Tue Feb 03 2026 - 06:01:22 EST




On 2026/2/3 17:03, Petr Mladek wrote:
On Tue 2026-02-03 11:08:33, Lance Yang wrote:
On 2026/2/3 11:05, Lance Yang wrote:
On 2026/1/25 21:58, Aaron Tomlin wrote:
The check_hung_task() function currently conflates two distinct
responsibilities: validating whether a task is hung and handling the
subsequent reporting (printing warnings, triggering panics, or
tracepoints).

This patch refactors the logic by introducing hung_task_info(), a
function dedicated solely to reporting. The actual detection check,
task_is_hung(), is hoisted into the primary loop within
check_hung_uninterruptible_tasks(). This separation clearly decouples
the mechanism of detection from the policy of reporting.

Furthermore, to facilitate future support for concurrent hung task
detection, the global sysctl_hung_task_detect_count variable is
converted from unsigned long to atomic_long_t. Consequently, the
counting logic is updated to accumulate the number of hung tasks locally
(this_round_count) during the iteration. The global counter is then
updated atomically via atomic_long_cmpxchg_release() once the loop
concludes, rather than incrementally during the scan.

These changes are strictly preparatory and introduce no functional
change to the system's runtime behaviour.
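
For anyone skimming the thread, the accumulate-then-publish scheme described
above boils down to the following minimal sketch. It uses C11 atomics instead
of the kernel's atomic_long_t API, and scan_round()/task_is_hung_stub() are
made-up stand-ins rather than names from the patch:

    /* Build with: cc -std=c11 sketch.c */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    static atomic_long detect_count;   /* models sysctl_hung_task_detect_count */

    static bool task_is_hung_stub(int id)
    {
            return id % 4 == 0;        /* pretend every 4th task is hung */
    }

    static void scan_round(int ntasks)
    {
            long total, this_round = 0;

            /* Snapshot the counter; userspace may reset it mid-scan. */
            total = atomic_load_explicit(&detect_count, memory_order_acquire);

            for (int id = 0; id < ntasks; id++)
                    if (task_is_hung_stub(id))
                            this_round++;

            if (!this_round)
                    return;

            /*
             * Publish the whole round at once. If the snapshot no longer
             * matches (the counter was reset meanwhile), the exchange fails
             * and the round is dropped instead of resurrecting a stale value.
             */
            atomic_compare_exchange_strong_explicit(&detect_count, &total,
                            total + this_round,
                            memory_order_release, memory_order_relaxed);
    }

    int main(void)
    {
            scan_round(10);
            printf("detect_count = %ld\n", atomic_load(&detect_count));
            return 0;
    }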

Signed-off-by: Aaron Tomlin <atomlin@xxxxxxxxxxx>
---
  kernel/hung_task.c | 58 ++++++++++++++++++++++++++--------------------
  1 file changed, 33 insertions(+), 25 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index d2254c91450b..df10830ed9ef 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -36,7 +36,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
  /*
   * Total number of tasks detected as hung since boot:
   */
-static unsigned long __read_mostly sysctl_hung_task_detect_count;
+static atomic_long_t sysctl_hung_task_detect_count = ATOMIC_LONG_INIT(0);
  /*
   * Limit number of tasks checked in a batch.
@@ -223,31 +223,29 @@ static inline void debug_show_blocker(struct task_struct *task, unsigned long ti
  }
  #endif
-static void check_hung_task(struct task_struct *t, unsigned long timeout,
-        unsigned long prev_detect_count)
+/**
+ * hung_task_info - Print diagnostic details for a hung task
+ * @t: Pointer to the detected hung task.
+ * @timeout: Timeout threshold for detecting hung tasks
+ * @this_round_count: Count of hung tasks detected in the current iteration
+ *
+ * Print structured information about the specified hung task, if warnings
+ * are enabled or if the panic batch threshold is exceeded.
+ */
+static void hung_task_info(struct task_struct *t, unsigned long timeout,
+               unsigned long this_round_count)
  {
-    unsigned long total_hung_task;
-
-    if (!task_is_hung(t, timeout))
-        return;
-
-    /*
-     * This counter tracks the total number of tasks detected as hung
-     * since boot.
-     */
-    sysctl_hung_task_detect_count++;

Previously, the global detect count was updated immediately when a hung task
was found. But now it only updates after the full scan finishes ...

Ideally, the count should update as soon as possible, so that userspace
can react in time :)

For example, by migrating critical containers away from the node before
the situation gets worse - something we already do.

Sorry, I should have said that earlier - just realized it ...

Better late than sorry ;-)

;P


That said, is the delay really critical? I guess that the userspace
checks the counter in regular intervals (seconds or tens of seconds).
Or is there any way to get a notification immediately?

We just rely on polling the counter every 0.x seconds.
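
(FWIW, a minimal poller of that kind might look like the snippet below. The
0.5 s interval and the printf reaction are placeholders, and it assumes the
counter is exported at /proc/sys/kernel/hung_task_detect_count with
CONFIG_DETECT_HUNG_TASK enabled:)

    #include <stdio.h>
    #include <unistd.h>

    #define DETECT_COUNT "/proc/sys/kernel/hung_task_detect_count"

    int main(void)
    {
            long prev = -1;

            for (;;) {
                    FILE *f = fopen(DETECT_COUNT, "r");
                    long cur;

                    if (f) {
                            if (fscanf(f, "%ld", &cur) == 1) {
                                    if (prev >= 0 && cur > prev)
                                            printf("%ld new hung task(s), react now\n",
                                                   cur - prev);
                                    prev = cur;
                            }
                            fclose(f);
                    }
                    usleep(500000);    /* poll every 0.5 s */
            }
            return 0;
    }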

I don't think the full scan itself would take many seconds, but the reporting
(e.g. pr_err) could be slow in some cases ...


Anyway, I thought about how the counting and barriers might work when
we update the global counter immediately, and I came up with
the following:

Nice! That should do the right thing.


diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 350093de0535..8bc043fbe89c 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -302,15 +302,10 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
     int max_count = sysctl_hung_task_check_count;
     unsigned long last_break = jiffies;
     struct task_struct *g, *t;
-    unsigned long total_count, this_round_count;
+    unsigned long this_round_count;
     int need_warning = sysctl_hung_task_warnings;
     unsigned long si_mask = hung_task_si_mask;

-    /*
-     * The counter might get reset. Remember the initial value.
-     * Acquire prevents reordering task checks before this point.
-     */
-    total_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count);
     /*
      * If the system crashed already then all bets are off,
      * do not report extra hung tasks:
@@ -330,6 +325,13 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
         }

         if (task_is_hung(t, timeout)) {
+            /*
+             * Increment the global counter so that userspace could
+             * start migrating tasks ASAP. But count the current
+             * round separately because userspace could reset
+             * the global counter at any time.
+             */
+            atomic_long_inc(&sysctl_hung_task_detect_count);


An atomic increment with relaxed ordering, which is good enough here and works well, IIUC.
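
(In C11 terms, that is just a relaxed fetch-add on a monotonic event counter;
a tiny model, not the kernel API:)

    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_long detect_count;

    /*
     * Each detected hang is published immediately. Relaxed ordering is
     * sufficient because readers only poll for an eventually-visible,
     * monotonically increasing value.
     */
    static void note_hung_task(void)
    {
            atomic_fetch_add_explicit(&detect_count, 1, memory_order_relaxed);
    }

    int main(void)
    {
            note_hung_task();
            printf("detect_count = %ld\n", atomic_load(&detect_count));
            return 0;
    }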


             this_round_count++;
             hung_task_info(t, timeout, this_round_count);
         }
@@ -340,15 +342,6 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
     if (!this_round_count)
         return;

-    /*
-     * Do not count this round when the global counter has been reset
-     * during this check. Release ensures we see all hang details
-     * recorded during the scan.
-     */
-    atomic_long_cmpxchg_release(&sysctl_hung_task_detect_count,
-                total_count, total_count +
-                this_round_count);
-
     if (need_warning || hung_task_call_panic) {
         si_mask |= SYS_INFO_LOCKS;

I am not sure if the comment above the increment is needed.
Well, it might help anyone understand the motivation without
digging into the git log history.

Looks good to me. Could you send it as a follow-up patch?

Cheers,
Lance