Re: [Patch] kexec: remove KMSG_DUMP_KEXEC (was Re: Query about kdump_msghook into crash_kexec())

From: KOSAKI Motohiro
Date: Thu Jun 09 2011 - 07:16:01 EST


Hi

> Seiji Aguchi <seiji.aguchi@xxxxxxx> writes:
>
>> Hi,
>>
>>> What are you using kmsg_dump() for? Using mtdoops, ramoops or something
>>> else? Is it working reliably for you?
>>
>> I plan to use kmsg_dump() for set_variable service of UEFI.
>> I proposed a prototype patch this month and will improve it.
>> (kmsg_dump is used inside pstore.)
>>
>> https://lkml.org/lkml/2011/5/10/340
>
> Shudder. Firmware calls in the crash path.
>
> If that is the use, we need to remove the kmsg_dump(KMSG_DUMP_KEXEC)
> hook from crash_kexec yesterday. It is leading to some really ludicrous
> suggestions that are on the way from making kexec on panic unreliable
> and useless.

Do you have a concrete example?

If you are only talking about a theoretical issue, adding a boot parameter
is probably a sufficient and reasonable approach.
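
To illustrate the boot parameter idea, here is a rough userspace sketch,
not a patch. The parameter name "kmsg_dump_on_kexec" and the fake_*
helpers are hypothetical and exist only for this example; a real change
would presumably register the handler with __setup() and guard the
kmsg_dump(KMSG_DUMP_KEXEC) call in crash_kexec().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical parameter name; not an existing kernel option. */
#define DUMP_PARAM "kmsg_dump_on_kexec="

/* Default keeps the behaviour discussed in this thread: dump on kexec. */
static bool dump_on_kexec = true;

/* Counts how often the stand-in dump hook ran. */
static int dumps_done;

/* Parse a command line string, in the spirit of a __setup() handler. */
static void parse_cmdline(const char *cmdline)
{
	const char *p;

	assert(cmdline != NULL);
	p = strstr(cmdline, DUMP_PARAM);
	if (p)
		dump_on_kexec = (p[strlen(DUMP_PARAM)] != '0');
}

/* Stand-in for kmsg_dump(KMSG_DUMP_KEXEC). */
static void fake_kmsg_dump(void)
{
	dumps_done++;
}

/* Stand-in for the crash path: the hook runs only when enabled. */
static void fake_crash_kexec(void)
{
	if (dump_on_kexec)
		fake_kmsg_dump();
	/* ... the real code would jump to the capture kernel here ... */
}
```

With "kmsg_dump_on_kexec=0" on the command line the hook is skipped, so
people who worry about firmware calls in the crash path can turn it off
while others keep it.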


> There will always be EFI implementations where that will not work and
> there will be no way we can fix those.
>
> There is a long history of people trying to do things in a crashing
> kernel, things that simply do not work when the system is in a bad
> state. kmsg_dump() when I reviewed the code had significant
> implementation problems for being called from interrupt handlers
> and the like.
>
> To introduce a different solution for capturing information when a
> kernel crashes we need to see numbers that in a large number of
> situations that the mechanism you are proposing is more reliable and/or
> more maintainable than the current kexec on panic implementation.
>
> The best work I know of on the reliability of the current situation
> is "Evaluating Linux Kernel Crash Dumping Mechanisms", by Fernando Luis Vazquez Cao.
> http://www.linuxsymposium.org/archives/OLS/Reprints-2006/cao-reprint.pdf

Every idea for improving reliability is welcome! That would help embedded
systems too.


> Now it does happen to be a fact that our efi support in linux is
> so buggy kexec does not work let alone kexec on panic (if the target
> kernel has any efi support). But our efi support being buggy is not
> a reason to add more ways to fail when we have a kernel with efi
> support. It is an argument to remove our excessive use of EFI
> calls.

Which part is buggy? As far as I know, IBM, HP and Fujitsu have server
products with EFI support, and they work.

So, if you are suffering from buggy EFI, I think the best way is to add a
config option, so that you can disable it and Aguchi-san can enable it.


> So let's just remove the ridiculous kmsg_dump(KMSG_DUMP_KEXEC) hook from
> crash_kexec and remove any temptation for abuses like wanting to use
> kmsg_dump() on anything but a deeply embedded system where there simply
> is not enough memory for 2 kernels.

This sentence contains two misunderstandings. 1) Modern embedded systems
(e.g. Android) have more memory than a PC of 10 years ago. 2) They often
don't need a full-featured second kernel, and they often don't need a full
crash dump.

Thanks.


