Re: [PATCH v3] hung_task: deduplicate identical hang reports

From: Aaron Tomlin

Date: Sun Jun 28 2026 - 16:09:31 EST

On Mon, Jun 22, 2026 at 09:18:47AM +0900, Masami Hiramatsu wrote:
> On Sun, 21 Jun 2026 17:37:56 -0400
> Aaron Tomlin <atomlin@xxxxxxxxxxx> wrote:
> > 2. Introduce a hung_task_reported bit-field in task_struct. If a task
> > remains hung across multiple intervals, khungtaskd recognises it
> > has already been reported. The bit is safely cleared without
> > locks or atomics the moment the task's context switch counter
> > increments.
> >
> > 3. For duplicate tasks, we still print the single-line
> > "INFO: task ..." message and trigger tracepoint
> > trace_sched_process_hang(). It merely skips calling
> > sched_show_task() and debug_show_blocker(), printing a concise
> > suppression notice instead.
>
> Ah, OK. So if we need more information, we can record it on trace
> ring buffer.

Hi Masami,

Indeed. I have ensured that the trace_sched_process_hang() tracepoint
fires unconditionally for each detected hung task.

> > @@ -261,8 +265,12 @@ static void hung_task_info(struct task_struct *t, unsigned long timeout,
> > pr_err(" Blocked by coredump.\n");
> > pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
> > " disables this message.\n");
> > - sched_show_task(t);
> > - debug_show_blocker(t, timeout);
> > + if (!skip_show_task) {
> > + sched_show_task(t);
> > + debug_show_blocker(t, timeout);
> > + } else {
> > + pr_err(" Stack trace suppressed. Already reported or duplicate wchan\n");
>
> Can we show the wchan hash for each task, so that we can see which
> tasks are waiting on the same wchan?

As Petr pointed out in his review [1], using wchan as a deduplication token
is unfortunately too coarse. It risks grouping tasks blocked on completely
unrelated resources that merely happen to sleep in the same generic wait
path (e.g., mutex_lock_slowpath()).

I am pivoting to a deterministic model based on Petr's suggestion [1].

We will conditionally leverage CONFIG_DETECT_HUNG_TASK_BLOCKER and hash the
exact memory address of the "targeted lock" (i.e., t->blocker &
~BLOCKER_TYPE_MASK). This ensures complete precision. Tasks are only
deduplicated if they are waiting on the exact same resource instance.

In version 3 [2], the budget is unconditionally decremented before
skip_show_task is evaluated, which needlessly consumes the budget for
every duplicate task.

However, printing the "INFO:" header for duplicates without decrementing
the budget can flood the ring buffer indefinitely [3], because the
sysctl_hung_task_warnings counter might never reach zero. Suppressing
console output for duplicates entirely, while relying on the
sched_process_hang tracepoint for observability, is the only way to
strictly guarantee both ring-buffer safety and preservation of the
warning budget.
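
As a simplified user-space model of that accounting (a sketch only, not
kernel code): the warnings variable stands in for
sysctl_hung_task_warnings, with a negative value assumed to mean
"unlimited", and struct report_stats and report_hung_task() are
hypothetical names used purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct report_stats {
	int console_reports;	/* full reports printed to the console */
	int trace_events;	/* sched_process_hang events emitted */
};

/*
 * Model of one hung-task report. Duplicates are counted as trace
 * events only: they print nothing and leave the budget untouched.
 */
static void report_hung_task(int *warnings, bool duplicate,
			     struct report_stats *st)
{
	assert(warnings != NULL && st != NULL);

	st->trace_events++;	/* the tracepoint fires for every hung task */

	if (duplicate)
		return;		/* silent on the console, no budget cost */
	if (*warnings == 0)
		return;		/* budget exhausted */
	if (*warnings > 0)
		(*warnings)--;	/* negative is treated as unlimited */

	st->console_reports++;
}
```

With a budget of 2, one unique task followed by two duplicates leaves the
budget at 1 and produces a single console report, while all three tasks
still show up as trace events.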

Please let me know your thoughts.

[1]: https://lore.kernel.org/lkml/ajlbvjcfpRuMmfaC@xxxxxxxxxxxxxxx/
[2]: https://lore.kernel.org/lkml/20260621213756.43225-1-atomlin@xxxxxxxxxxx/
[3]: https://lore.kernel.org/lkml/20260627205733.90983-1-atomlin@xxxxxxxxxxx/

Kind regards,
--
Aaron Tomlin