Re: [PATCH] hung_task: Add per-round stack trace deduplication

From: Aaron Tomlin

Date: Fri Jun 19 2026 - 18:06:52 EST


On Wed, Jun 17, 2026 at 11:04:55PM +0100, David Laight wrote:
> On Wed, 17 Jun 2026 14:48:41 -0400
> Aaron Tomlin <atomlin@xxxxxxxxxxx> wrote:
>
> > Currently, when multiple tasks hang in the exact same location (e.g.,
> > such as severe contention for a mutex), khungtaskd indiscriminately
> > reports every single instance. This wastes ring buffer space with
> > identical stack traces up to the defined warning limit (i.e.,
> > kernel.hung_task_warnings), obscuring the root cause without providing
> > any additional diagnostic value.
> >
> > Introduce a lightweight, hash-based stack trace deduplicator for
> > khungtaskd to ensure only unique stack traces are reported during
> > a single detection interval.
>
> How many different stacks do you need to suppress?
> Mostly wont it be 'the same as the last one'?
> So just a linear scan through a very small number of entries will
> largely DTRT.
> Much simpler code and a much smaller data footprint.

Hi David,

Thank you for your review and the constructive feedback.

If a system experiences severe contention on a single lock, you are
entirely correct: a straightforward "last seen" check or a small linear
array of recent stacks would serve the purpose well, keeping the memory
footprint to an absolute minimum.

However, the hash table approach was deliberately chosen to accommodate
severe cascading failures wherein multiple distinct locks are contended
simultaneously. In such scenarios, the hung tasks evaluated by the watchdog
are frequently heavily interleaved within the task list (e.g., Lock A, Lock
B, Lock C, Lock A, Lock B).

A small linear array would thrash in this state: each new stack evicts an
entry that is needed again a few tasks later, so the duplicates are
reported after all and the ring buffer is flooded once more. By contrast,
the 12-bit static hash table ensures deterministic, O(1) deduplication
regardless of how heavily the hung tasks interleave. We consider this
paramount during a system-wide lock storm.

Whilst it does incur a cost of 16 KB of static memory, this appears to be a
prudent and justifiable compromise to ensure we do not forfeit the
deduplication benefits during the most catastrophic crashes.

Furthermore, based on additional feedback, the forthcoming v2 patch
significantly reduces the execution overhead by skipping the stack
unwinding entirely when warnings are disabled or the sysctl is toggled off.
It also properly guards the logic for CONFIG_STACKTRACE=n builds.

Please do let me know if this rationale for retaining the hash table seems
reasonable to you.


Kind regards,
--
Aaron Tomlin