Re: [PATCH v2] panic: Add options to print system info when panic happens

From: Steven Rostedt
Date: Tue Nov 27 2018 - 13:23:25 EST


On Tue, 27 Nov 2018 15:15:20 +0800
Feng Tang <feng.tang@xxxxxxxxx> wrote:

> Kernel panic issues are always painful to debug, partly
> because it's not easy to get enough information about the
> context when the panic happens.
>
> We have ramoops and kdump for that, but this commit
> tries to provide an easier way to show the system info by adding
> a cmdline parameter, borrowing some ideas from the sysrq handler.
>
> Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: John Stultz <john.stultz@xxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> ---
> Changelog:
> v2:
> - change text "dump/DUMP" to "print/PRINT" which
> is more accurate, suggested by Andrew Morton
> - add code to print ftrace buffer
>
> Documentation/admin-guide/kernel-parameters.txt | 8 +++++++
> kernel/panic.c | 28 +++++++++++++++++++++++++
> 2 files changed, 36 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 19f4423..80c819a 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3081,6 +3081,14 @@
> timeout < 0: reboot immediately
> Format: <timeout>
>
> + panic_print= Bitmask for printing system info when panic happens.
> + User can choose a combination of the following bits:
> + bit 0: print all tasks info
> + bit 1: print system memory info
> + bit 2: print timer info
> + bit 3: print locks info if CONFIG_LOCKDEP is on
> + bit 4: print ftrace buffer

Note, "ftrace_dump_on_oops" accomplishes the same thing.

Should this be a sysctl setting as well?

-- Steve

> +
> panic_on_warn panic() instead of WARN(). Useful to cause kdump
> on a WARN().
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index f6d549a..fb6ccd1 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -45,6 +45,13 @@ int panic_on_warn __read_mostly;
> int panic_timeout = CONFIG_PANIC_TIMEOUT;
> EXPORT_SYMBOL_GPL(panic_timeout);
>
> +#define PANIC_PRINT_TASK_INFO 0x00000001
> +#define PANIC_PRINT_MEM_INFO 0x00000002
> +#define PANIC_PRINT_TIMER_INFO 0x00000004
> +#define PANIC_PRINT_LOCK_INFO 0x00000008
> +#define PANIC_PRINT_FTRACE_INFO 0x00000010
> +static unsigned long panic_print;
> +
> ATOMIC_NOTIFIER_HEAD(panic_notifier_list);
>
> EXPORT_SYMBOL(panic_notifier_list);
> @@ -124,6 +131,24 @@ void nmi_panic(struct pt_regs *regs, const char *msg)
> }
> EXPORT_SYMBOL(nmi_panic);
>
> +static void panic_print_sys_info(void)
> +{
> + if (panic_print & PANIC_PRINT_TASK_INFO)
> + show_state();
> +
> + if (panic_print & PANIC_PRINT_MEM_INFO)
> + show_mem(0, NULL);
> +
> + if (panic_print & PANIC_PRINT_TIMER_INFO)
> + sysrq_timer_list_show();
> +
> + if (panic_print & PANIC_PRINT_LOCK_INFO)
> + debug_show_all_locks();
> +
> + if (panic_print & PANIC_PRINT_FTRACE_INFO)
> + ftrace_dump(DUMP_ALL);
> +}
> +
> /**
> * panic - halt the system
> * @fmt: The text string to print
> @@ -250,6 +275,8 @@ void panic(const char *fmt, ...)
> debug_locks_off();
> console_flush_on_panic();
>
> + panic_print_sys_info();
> +
> if (!panic_blink)
> panic_blink = no_blink;
>
> @@ -654,6 +681,7 @@ void refcount_error_report(struct pt_regs *regs, const char *err)
> #endif
>
> core_param(panic, panic_timeout, int, 0644);
> +core_param(panic_print, panic_print, ulong, 0644);
> core_param(pause_on_oops, pause_on_oops, int, 0644);
> core_param(panic_on_warn, panic_on_warn, int, 0644);
> core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0644);
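
For reference, a minimal user-space sketch of how the bits documented in the
quoted patch combine into a panic_print= value. The bit values are copied from
the patch; the helper panic_print_cmdline() is hypothetical and exists only
for illustration, and the hex form assumes the parameter parser accepts a
"0x" prefix.

```c
#include <stddef.h>
#include <stdio.h>

/* Bit values copied from the quoted patch (kernel/panic.c hunk). */
#define PANIC_PRINT_TASK_INFO   0x00000001
#define PANIC_PRINT_MEM_INFO    0x00000002
#define PANIC_PRINT_TIMER_INFO  0x00000004
#define PANIC_PRINT_LOCK_INFO   0x00000008
#define PANIC_PRINT_FTRACE_INFO 0x00000010

/*
 * Hypothetical helper, not part of the patch: format the boot
 * parameter string for a given bitmask. Returns what snprintf()
 * returns, i.e. the length the full string needs.
 */
static int panic_print_cmdline(unsigned long mask, char *buf, size_t len)
{
	return snprintf(buf, len, "panic_print=0x%lx", mask);
}
```

Calling panic_print_cmdline() with PANIC_PRINT_TASK_INFO | PANIC_PRINT_MEM_INFO |
PANIC_PRINT_FTRACE_INFO produces "panic_print=0x13", which would request the
task, memory and ftrace output on panic.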