Re: [External Mail] Re: [PATCH] hung_task: Panic after fixed number of hung tasks
From: Lance Yang
Date: Sat Sep 27 2025 - 23:29:19 EST
On 2025/9/28 11:19, Li,Rongqing wrote:
-----Original Message-----
From: Lance Yang <lance.yang@xxxxxxxxx>
Sent: September 27, 2025 10:39
To: Li,Rongqing <lirongqing@xxxxxxxxx>
Cc: linux-doc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; arnd@xxxxxxxx;
joel.granados@xxxxxxxxxx; feng.tang@xxxxxxxxxxxxxxxxx; pauld@xxxxxxxxxx;
kees@xxxxxxxxxx; rostedt@xxxxxxxxxxx; pawan.kumar.gupta@xxxxxxxxxxxxxxx;
akpm@xxxxxxxxxxxxxxxxxxxx; dave.hansen@xxxxxxxxxxxxxxx; mingo@xxxxxxxxxx;
paulmck@xxxxxxxxxx; corbet@xxxxxxx; mhiramat@xxxxxxxxxx
Subject: [External Mail] Re: [PATCH] hung_task: Panic after fixed number of hung tasks
On 2025/9/25 14:06, lirongqing wrote:
From: Li RongQing <lirongqing@xxxxxxxxx>
Currently, when hung_task_panic is enabled, kernel will panic
immediately upon detecting the first hung task. However, some hung
tasks are transient and the system can recover fully, while others are
unrecoverable and trigger consecutive hung task reports, and a panic is
expected.
This commit adds a new sysctl parameter, hung_task_count_to_panic, that
allows specifying the number of consecutive hung tasks that must be
detected before triggering a kernel panic. This provides finer control
for environments where transient hangs may happen but persistent
hangs should still be fatal.
Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
---
Documentation/admin-guide/sysctl/kernel.rst | 6 ++++++
kernel/hung_task.c | 14 +++++++++++++-
2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 8b49eab..4240e7b 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -405,6 +405,12 @@ This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
1 Panic immediately.
= =================================================
+hung_task_count_to_panic
+=====================
+
+When set to a non-zero value, after the number of consecutive hung
+task occur, the kernel will triggers a panic
Hmm... the documentation here seems a bit misleading.
hung_task_panic=1 will always cause an immediate panic, regardless of the
hung_task_count_to_panic setting, right?
Perhaps something like this would be more accurate?
```
hung_task_count_to_panic
========================
When set to a non-zero value, a kernel panic will be triggered if the number of
detected hung tasks reaches this value.
Note that setting hung_task_panic=1 will still cause an immediate panic on the
first hung task, overriding this setting.
```
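To make the interaction between the two knobs concrete, here is a minimal userspace sketch of the condition this patch adds to check_hung_task(). The function name should_panic and its parameters are hypothetical stand-ins for the sysctl variables; it only mirrors the quoted hunk and is not kernel code.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the panic decision in the patch.
 *   hung_task_panic - stands in for sysctl_hung_task_panic
 *   count_to_panic  - stands in for sysctl_hung_task_count_to_panic
 *   detect_count    - stands in for sysctl_hung_task_detect_count
 */
bool should_panic(unsigned int hung_task_panic,
                  unsigned int count_to_panic,
                  unsigned long detect_count)
{
    /* hung_task_panic=1 always wins: panic on the first hung task. */
    if (hung_task_panic)
        return true;

    /* A threshold of 0 disables the count-based panic entirely. */
    if (count_to_panic == 0)
        return false;

    /* Otherwise panic once the detected count reaches the threshold. */
    return detect_count >= count_to_panic;
}
```

With hung_task_panic=0 and a threshold of 3, the sketch returns false for the first two detections and true from the third onward; with hung_task_panic=1 it returns true regardless of the threshold.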
hung_task_check_count
=====================
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 8708a12..87a6421 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -83,6 +83,8 @@ static unsigned int __read_mostly sysctl_hung_task_all_cpu_backtrace;
static unsigned int __read_mostly sysctl_hung_task_panic =
IS_ENABLED(CONFIG_BOOTPARAM_HUNG_TASK_PANIC);
+static unsigned int __read_mostly sysctl_hung_task_count_to_panic;
Nit: while static variables are guaranteed to be zero-initialized, it's a good
practice and clearer for readers to initialize them explicitly.
static unsigned int __read_mostly sysctl_hung_task_count_to_panic = 0;
./scripts/checkpatch.pl reports an error when statics are initialised to 0, so I will keep it without an explicit initializer:
ERROR: do not initialise statics to 0
#51: FILE: kernel/hung_task.c:86:
+static unsigned int __read_mostly sysctl_hung_task_count_to_panic = 0;
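For reference, the checkpatch rule relies on the C guarantee that an object with static storage duration and no initializer starts out as zero, so the explicit "= 0" adds nothing. A small userspace sketch, with hypothetical variable and function names:

```c
/* No initializer: static storage duration means this starts at zero. */
static unsigned int count_to_panic_implicit;

/* Explicit initializer: same value, but checkpatch flags this form. */
static unsigned int count_to_panic_explicit = 0;

/* Accessors so the two values can be compared from a caller. */
unsigned int read_implicit(void)
{
    return count_to_panic_implicit;
}

unsigned int read_explicit(void)
{
    return count_to_panic_explicit;
}
```

Both accessors return 0 before anything writes to the variables.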
Ah, good spot! Let’s leave it as is ;)
Cheers,
Lance
thanks
-Li
Otherwise, this patch looks good to me!
Acked-by: Lance Yang <lance.yang@xxxxxxxxx>
+
static int
hung_task_panic(struct notifier_block *this, unsigned long event, void *ptr)
{
@@ -219,7 +221,9 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
trace_sched_process_hang(t);
- if (sysctl_hung_task_panic) {
+ if (sysctl_hung_task_panic ||
+ (sysctl_hung_task_count_to_panic &&
+ (sysctl_hung_task_detect_count >=
+sysctl_hung_task_count_to_panic))) {
console_verbose();
hung_task_show_lock = true;
hung_task_call_panic = true;
@@ -388,6 +392,14 @@ static const struct ctl_table hung_task_sysctls[] = {
.extra2 = SYSCTL_ONE,
},
{
+ .procname = "hung_task_count_to_panic",
+ .data = &sysctl_hung_task_count_to_panic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ },
+ {
.procname = "hung_task_check_count",
.data = &sysctl_hung_task_check_count,
.maxlen = sizeof(int),