Re: [PATCH] kernel/panic: Add "late_kdump" option for kdump in unstable condition

From: Masami Hiramatsu
Date: Wed Apr 16 2014 - 22:00:16 EST


Thank you for the review!

(2014/04/16 22:48), Vivek Goyal wrote:
> On Mon, Apr 14, 2014 at 01:51:58PM +0900, Masami Hiramatsu wrote:
>> Add a "late_kdump" option to run kdump after running panic
>> notifiers and dump kmsg. This can help rare situations which
>> kdump drops in failure because of unstable crashed kernel
>> or hardware failure (memory corruption on critical data/code),
>> or the 2nd kernel is broken by the 1st kernel (it's a broken
>> behavior, but who can guarantee that the "crashed" kernel
>> works correctly?).
>>
>> Usage: add "late_kdump" to kernel boot option. That's all.
>>
>> Note that this actually increases risks of the failure of
>> kdump. This option should be set only if you worry about
>> the rare case of kdump failure rather than increasing the
>> chance of success.
>>
>> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@xxxxxxxxxxx>
>> Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx>
>> Cc: Vivek Goyal <vgoyal@xxxxxxxxxx>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@xxxxxxxxxxx>
>> Cc: Satoru MORIYA <satoru.moriya.br@xxxxxxxxxxx>
>> Cc: Motohiro Kosaki <Motohiro.Kosaki@xxxxxxxxxxxxxx>
>> Cc: Takenori Nagano <t-nagano@xxxxxxxxxxxxx>
>> ---
>> Documentation/kernel-parameters.txt | 7 +++++++
>> kernel/panic.c | 24 ++++++++++++++++++++++--
>> 2 files changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>> index 03e50b4..1ba58da 100644
>> --- a/Documentation/kernel-parameters.txt
>> +++ b/Documentation/kernel-parameters.txt
>> @@ -2339,6 +2339,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>> timeout < 0: reboot immediately
>> Format: <timeout>
>>
>> + late_kdump Run kdump after running panic-notifiers and dumping
>> + kmsg. This only for the users who doubt kdump always
>> + succeeds in any situation.
>> + Note that this also increases risks of kdump failure,
>> + because some panic notifiers can make the crashed
>> + kernel more unstable.
>> +
>
> I am wondering if "crash_kexec_post_notifiers" will be a better name
> to represent what we are trying to do here.

OK, I'll rename that.

>
>> parkbd.port= [HW] Parallel port number the keyboard adapter is
>> connected to, default is 0.
>> Format: <parport#>
>> diff --git a/kernel/panic.c b/kernel/panic.c
>> index d02fa9f..bba42b5 100644
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -32,6 +32,7 @@ static unsigned long tainted_mask;
>> static int pause_on_oops;
>> static int pause_on_oops_flag;
>> static DEFINE_SPINLOCK(pause_on_oops_lock);
>> +static bool late_kdump;
>>
>> int panic_timeout = CONFIG_PANIC_TIMEOUT;
>> EXPORT_SYMBOL_GPL(panic_timeout);
>> @@ -112,9 +113,14 @@ void panic(const char *fmt, ...)
>> /*
>> * If we have crashed and we have a crash kernel loaded let it handle
>> * everything else.
>> - * Do we want to call this before we try to display a message?
>> + * If we want to call this after we try to display a message, pass
>> + * the "late_kdump" option to the kernel.
>> */
>> - crash_kexec(NULL);
>> + if (!late_kdump)
>> + crash_kexec(NULL);
>> + else
>> + pr_emerg("Warning: late_kdump option is set. Please DO NOT "
>> + "report bugs about kdump failure with this option.\n");
>
> I think above message about DO NOT report bugs seems unnecessary.

OK, I'll just print a notice that the option is set, as below:
"Warning: crash_kexec_post_notifiers is set.\n"
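
To make the intended ordering concrete, here is a rough user-space model of what the option changes in panic(). It is only a sketch, not the kernel code: model_panic(), step() and the trace buffer are hypothetical names made up for illustration, and it assumes a crash kernel is loaded, so crash_kexec() never returns and is always the last step.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical user-space model; these are NOT the real kernel functions. */
static char trace[64];

/* Append a step name to the trace so the call order can be inspected. */
static void step(const char *name)
{
	if (trace[0] != '\0')
		strncat(trace, ",", sizeof(trace) - strlen(trace) - 1);
	strncat(trace, name, sizeof(trace) - strlen(trace) - 1);
}

/*
 * Model the ordering in panic(), assuming a crash kernel is loaded
 * (so crash_kexec() does not return). Returns the recorded call order.
 */
static const char *model_panic(bool crash_kexec_post_notifiers)
{
	trace[0] = '\0';

	if (!crash_kexec_post_notifiers) {
		/* Default: jump to the crash kernel before anything else. */
		step("crash_kexec");
		return trace;
	}

	/* Option set: run the notifiers and dump kmsg first, then kdump. */
	step("notifiers");
	step("kmsg_dump");
	step("crash_kexec");
	return trace;
}
```

With the option unset the model records only "crash_kexec"; with it set, the panic notifiers and the kmsg dump run first, which is exactly where the extra risk comes from.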

Thank you again!

--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@xxxxxxxxxxx

