Re: [External Mail] Re: [PATCH][v2] hung_task: Panic after fixed number of hung tasks

From: Lance Yang

Date: Sun Sep 28 2025 - 03:12:12 EST




On 2025/9/28 15:03, Li,Rongqing wrote:
On 2025/9/28 13:31, lirongqing wrote:
From: Li RongQing <lirongqing@xxxxxxxxx>

Currently, when hung_task_panic is enabled, the kernel will panic
immediately upon detecting the first hung task. However, some hung
tasks are transient and the system can recover fully, while others are
unrecoverable and trigger consecutive hung task reports, in which case
a panic is expected.

This commit adds a new sysctl parameter, hung_task_count_to_panic, to
allow specifying the number of consecutive hung tasks that must be
detected before triggering a kernel panic. This provides finer control
for environments where transient hangs may happen but persistent
hangs should still be fatal.

Acked-by: Lance Yang <lance.yang@xxxxxxxxx>
Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
---
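
For readers following the thread without the diff: the behaviour described in
the commit message can be sketched roughly as below. This is a userspace
illustration only, not the code from the patch. The counter name
consecutive_hung, the helper should_panic(), the reset on a clean scan, and
treating 0 as "never panic" are all assumptions made for the example.

```c
#include <stdbool.h>

/* Stand-in for the proposed sysctl; 0 is assumed to mean "never panic". */
static unsigned int hung_task_count_to_panic = 3;

/* Hypothetical running count of consecutively detected hung tasks. */
static unsigned int consecutive_hung;

/*
 * Called once per detector scan with the number of hung tasks found.
 * Returns true once the running count reaches the threshold.
 * Resetting the count after a clean scan is an assumption of this sketch.
 */
static bool should_panic(unsigned int hung_found)
{
	if (hung_found == 0) {
		consecutive_hung = 0;
		return false;
	}

	consecutive_hung += hung_found;

	return hung_task_count_to_panic != 0 &&
	       consecutive_hung >= hung_task_count_to_panic;
}
```

With the threshold at 3, a scan sequence that finds 1, 0, 1, 1, 1 hung tasks
would only ask for a panic on the last scan, since the clean second scan
resets the count.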

It's working as expected. So:
Tested-by: Lance Yang <lance.yang@xxxxxxxxx>

But on second thought: regarding this new sysctl parameter, I was wondering if
a name like max_hung_task_count_to_panic might be a bit more explicit, just to
follow the convention from max_rcu_stall_to_panic.


I see that all the hung task sysctl parameters start with "hung_task"? Should we keep this convention? If so, we could name it "hung_task_max_to_panic". If not, we could call it "max_hang_task_to_panic"?

Well, let's see what other folks think ;)