Re: [PATCH v2 2/4] hung_task: Add hung_task_sys_info sysctl to dump sys info on task-hung

From: Lance Yang
Date: Sun Nov 16 2025 - 02:59:12 EST

On 2025/11/13 19:10, Feng Tang wrote:
When task-hung happens, developers may need different kinds of system
information (call-stacks, memory info, locks, etc.) to help debugging.

Add a 'hung_task_sys_info' sysctl knob that takes a human-readable string
like "tasks,mem,timers,locks,ftrace,...". When a task hang is detected, all
requested information will be dumped (refer to kernel/sys_info.c for more
details).

Meanwhile, the newly introduced sys_info() call is used to unify some
existing info-dumping knobs.

Suggested-by: Petr Mladek <pmladek@xxxxxxxx>
Signed-off-by: Feng Tang <feng.tang@xxxxxxxxxxxxxxxxx>
---
Documentation/admin-guide/sysctl/kernel.rst | 5 ++
kernel/hung_task.c | 62 +++++++++++++--------
2 files changed, 43 insertions(+), 24 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index a397eeccaea7..45b4408dad31 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst

[...]

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 5ac0e66a1361..5b3a7785d3a2 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -24,6 +24,7 @@
#include <linux/sched/sysctl.h>
#include <linux/hung_task.h>
#include <linux/rwsem.h>
+#include <linux/sys_info.h>
#include <trace/events/sched.h>
@@ -59,12 +60,17 @@ static unsigned long __read_mostly sysctl_hung_task_check_interval_secs;
static int __read_mostly sysctl_hung_task_warnings = 10;
static int __read_mostly did_panic;
-static bool hung_task_show_lock;
static bool hung_task_call_panic;
-static bool hung_task_show_all_bt;
static struct task_struct *watchdog_task;
+/*
+ * A bitmask to control what kinds of system info to be printed when
+ * a hung task is detected, it could be task, memory, lock etc. Refer
+ * include/linux/sys_info.h for detailed bit definition.
+ */
+static unsigned long hung_task_si_mask;
+
#ifdef CONFIG_SMP
/*
* Should we dump all CPUs backtraces in a hung task event?
@@ -217,11 +223,8 @@ static inline void debug_show_blocker(struct task_struct *task, unsigned long ti
}
#endif
-static void check_hung_task(struct task_struct *t, unsigned long timeout,
- unsigned long prev_detect_count)
+static void check_hung_task(struct task_struct *t, unsigned long timeout)
{
- unsigned long total_hung_task;
-
if (!task_is_hung(t, timeout))
return;
@@ -231,20 +234,13 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout,
*/
sysctl_hung_task_detect_count++;
- total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
trace_sched_process_hang(t);
- if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
- console_verbose();
- hung_task_show_lock = true;
- hung_task_call_panic = true;
- }
-
/*
* Ok, the task did not get scheduled for more than 2 minutes,
* complain:
*/
- if (sysctl_hung_task_warnings || hung_task_call_panic) {
+ if (sysctl_hung_task_warnings) {

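It seems like the behavior changes when sysctl_hung_task_warnings is
0 but a panic is about to be triggered: the old check also tested
hung_task_call_panic, so the hung task info was still dumped in that
case, while the new check skips the dump ...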

Looking at the history:

1) Commit ("hung_task: ignore hung_task_warnings when hung_task_panic
is enabled")[1] ensured that hung task information is always dumped
when a panic is configured, even if the warning counter is exhausted.

2) Later, commit ("hung_task: panic when there are more than N hung
tasks at the same time")[2] refined the logic to trigger a panic based
on the number of hung tasks found in a single scan.

To stay consistent with the established behavior, I think we should
continue to dump the information for hung tasks as long as
sysctl_hung_task_panic is enabled :)
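
To make the difference concrete, here is a rough userspace model of the
three conditions. This is only a sketch, not kernel code: struct hung_cfg
and the helper names are made up for illustration, and dump_suggested() is
just one possible reading of "dump as long as sysctl_hung_task_panic is
enabled", not a tested fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the sysctl values, not the kernel variables. */
struct hung_cfg {
	int warnings;		/* models sysctl_hung_task_warnings */
	unsigned int panic;	/* models sysctl_hung_task_panic, 0 = disabled */
};

/* Condition from the removed lines: warnings left, or a panic is pending. */
static bool dump_before_patch(const struct hung_cfg *c, unsigned long total_hung)
{
	bool call_panic = c->panic && total_hung >= c->panic;

	return c->warnings || call_panic;
}

/* Condition from the added line: only the warning counter is checked. */
static bool dump_after_patch(const struct hung_cfg *c)
{
	return c->warnings != 0;
}

/* Assumed variant: keep dumping whenever the panic knob is enabled. */
static bool dump_suggested(const struct hung_cfg *c)
{
	return c->warnings || c->panic;
}

/* The case in question: warnings exhausted, panic on the first hung task. */
static void check_exhausted_warnings_case(void)
{
	struct hung_cfg c = { .warnings = 0, .panic = 1 };

	assert(dump_before_patch(&c, 1));	/* old code still dumps */
	assert(!dump_after_patch(&c));		/* patched code stays silent */
	assert(dump_suggested(&c));		/* assumed fix dumps again */
}
```

With warnings at 0 and the panic knob at 1, only the patched condition
skips the dump for the first hung task.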

[1] https://lore.kernel.org/all/20240613033159.3446265-1-leonylgao@xxxxxxxxx
[2] https://lore.kernel.org/all/20251015063615.2632-1-lirongqing@xxxxxxxxx
[...]

Cheers,
Lance