Re: [PATCH] panic: add option to dump blocked tasks in panic_print

From: Guilherme G. Piccoli
Date: Sat Feb 03 2024 - 07:05:43 EST


On 02/02/2024 10:20, Feng Tang wrote:
> For debugging kernel panics and other bugs, panic_print already has an
> option to dump all tasks' call stacks. On today's large servers
> running many containers, there can be thousands of tasks or more,
> so it prints a huge number of call stacks and takes a lot of time
> (on a serial console, which is the main target use case of panic_print).
>
> In many cases, only the few blocked tasks are key to the panic,
> so add an option to dump only the blocked tasks' call stacks.
>
> Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>
> [...]

Thank you Feng Tang, this is an interesting and useful idea!
I've just tested the patch and it works fine; I also see no code issues
on my side. So, feel free to add:


Tested-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>


Cheers!

---
> Documentation/admin-guide/kernel-parameters.txt | 1 +
> Documentation/admin-guide/sysctl/kernel.rst | 1 +
> kernel/panic.c | 4 ++++
> 3 files changed, 6 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 31b3a25680d0..0f2369e87175 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4182,6 +4182,7 @@
> bit 4: print ftrace buffer
> bit 5: print all printk messages in buffer
> bit 6: print all CPUs backtrace (if available in the arch)
> + bit 7: print tasks in uninterruptible (blocked) state
> *Be aware* that this option may print a _lot_ of lines,
> so there are risks of losing older messages in the log.
> Use this option carefully, maybe worth to setup a
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index 6584a1f9bfe3..e066a16b35d5 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -850,6 +850,7 @@ bit 3 print locks info if ``CONFIG_LOCKDEP`` is on
> bit 4 print ftrace buffer
> bit 5 print all printk messages in buffer
> bit 6 print all CPUs backtrace (if available in the arch)
> +bit 7 print tasks in uninterruptible (blocked) state
> ===== ============================================
>
> So for example to print tasks and memory info on panic, user can::
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 2807639aab51..aa17ae0897c0 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -73,6 +73,7 @@ EXPORT_SYMBOL_GPL(panic_timeout);
> #define PANIC_PRINT_FTRACE_INFO 0x00000010
> #define PANIC_PRINT_ALL_PRINTK_MSG 0x00000020
> #define PANIC_PRINT_ALL_CPU_BT 0x00000040
> +#define PANIC_PRINT_BLOCKED_TASKS 0x00000080
> unsigned long panic_print;
>
> ATOMIC_NOTIFIER_HEAD(panic_notifier_list);
> @@ -227,6 +228,9 @@ static void panic_print_sys_info(bool console_flush)
>
> if (panic_print & PANIC_PRINT_FTRACE_INFO)
> ftrace_dump(DUMP_ALL);
> +
> + if (panic_print & PANIC_PRINT_BLOCKED_TASKS)
> + show_state_filter(TASK_UNINTERRUPTIBLE);
> }
>
> void check_panic_on_warn(const char *origin)
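
For anyone composing a panic_print value by hand, here is a minimal userspace sketch of how the bits in the quoted hunk combine. The macro values are copied from the patch context above; the helpers panic_print_bit() and panic_print_format() are hypothetical names used only for illustration and are not part of the kernel.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace mirror of the flag values visible in the quoted kernel/panic.c hunk. */
#define PANIC_PRINT_FTRACE_INFO    0x00000010UL
#define PANIC_PRINT_ALL_PRINTK_MSG 0x00000020UL
#define PANIC_PRINT_ALL_CPU_BT     0x00000040UL
#define PANIC_PRINT_BLOCKED_TASKS  0x00000080UL /* bit 7, added by this patch */

/* Hypothetical helper: turn a documented bit number into its mask. */
static unsigned long panic_print_bit(unsigned int bit)
{
	assert(bit < 32); /* stay within the low 32 bits of the mask */
	return 1UL << bit;
}

/*
 * Hypothetical helper: format a mask as a "panic_print=0x..." string.
 * Returns what snprintf() returns (the length that was or would be written).
 */
static int panic_print_format(unsigned long mask, char *buf, size_t len)
{
	return snprintf(buf, len, "panic_print=0x%lx", mask);
}
```

With these definitions, requesting the ftrace buffer plus blocked tasks is PANIC_PRINT_FTRACE_INFO | PANIC_PRINT_BLOCKED_TASKS, which is 0x90, so the string produced for that mask is panic_print=0x90.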