RE: [RFC][Patch] Adding kmsg_dump() to reboot/halt/poweroff/emergency_restart path

From: Seiji Aguchi
Date: Wed Oct 27 2010 - 15:53:08 EST


Hi,

> What actual problem are we solving here? Why is the current code
> inadequate? It would help to demonstrate some use-case and to explain
> how the situation improved with this patch.

[Purpose]
My goal is to develop a highly reliable logging facility for enterprise use.

I'm planning to add the following triggers for kmsg_dump().
- reboot/poweroff/halt/emergency_restart (this patch)
- Machine check

I'm also planning to add a feature that outputs kernel messages to NVRAM,
because enterprise servers are equipped with NVRAM.
Outputting kernel messages to NVRAM gives us a highly reliable logging facility.
(NVRAM is commonly used on mainframes and commercial Unix systems as well.)

[Use case of reboot/poweroff/halt/emergency_restart]

My company has often experienced the following in our support service.
- A customer's system suddenly reboots.
- The customer asks us to investigate the cause of the reboot.

We can tell that a reboot happened, because the boot messages remain in /var/log/messages.
However, we can't investigate why the system rebooted,
because the last messages before the reboot are not preserved.
And of course we can't explain the cause to the customer.


We can solve the above problem with this patch as follows.
Case1: reboot with command
- We can see "Restarting system with command:" or "Restarting system.".

Case2: halt with command
- We can see "System halted.".

Case3: poweroff with command
- We can see "Power down.".

Case4: emergency_restart with sysrq.
- We can see "Sysrq:" printed by __handle_sysrq().

Case5: emergency_restart with softdog.
- We can see "Initiating system reboot" in watchdog_fire().

So, we can distinguish the cause of a reboot, poweroff, halt or emergency_restart.
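
To illustrate how the five cases above could be told apart once the last
messages are preserved, here is a minimal userspace sketch in C. It is not
part of the patch. The names shutdown_cause, classify_line() and
classify_log() are hypothetical, and the message strings are the ones quoted
in this mail, so the exact text may differ between kernel versions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not part of the patch or any kernel API. */
enum shutdown_cause {
    CAUSE_UNKNOWN,
    CAUSE_REBOOT,         /* Case1: reboot with command */
    CAUSE_HALT,           /* Case2: halt with command */
    CAUSE_POWEROFF,       /* Case3: poweroff with command */
    CAUSE_EMERG_SYSRQ,    /* Case4: emergency_restart with sysrq */
    CAUSE_EMERG_SOFTDOG   /* Case5: emergency_restart with softdog */
};

struct cause_pattern {
    const char *needle;
    enum shutdown_cause cause;
};

/* Message fragments as quoted in this mail (assumed, may vary by kernel version). */
static const struct cause_pattern patterns[] = {
    { "Restarting system",        CAUSE_REBOOT },
    { "System halted.",           CAUSE_HALT },
    { "Power down.",              CAUSE_POWEROFF },
    { "Sysrq:",                   CAUSE_EMERG_SYSRQ },
    { "Initiating system reboot", CAUSE_EMERG_SOFTDOG },
};

/* Classify one log line by substring match against the table above. */
enum shutdown_cause classify_line(const char *line)
{
    size_t i;

    for (i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
        if (strstr(line, patterns[i].needle) != NULL)
            return patterns[i].cause;
    }
    return CAUSE_UNKNOWN;
}

/* Scan the saved log from the newest line backwards and return the first match. */
enum shutdown_cause classify_log(const char *const *lines, size_t n)
{
    while (n > 0) {
        enum shutdown_cause c = classify_line(lines[--n]);

        if (c != CAUSE_UNKNOWN)
            return c;
    }
    return CAUSE_UNKNOWN;
}
```

A support tool could pass the lines recovered from the dump to classify_log()
and report the returned cause. CAUSE_UNKNOWN means none of the five messages
was found in the saved log.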

If a customer ran the reboot command, you might expect the customer to know that.
However, customers who rebooted the system by mistake often claim that they did not run the command.

The current Linux kernel leaves no message as evidence, so we can't show proof to the customer.
This patch improves the situation.

Seiji