Re: perf hw in kexeced kernel broken in tip
From: Eric W. Biederman
Date: Wed Dec 08 2010 - 16:17:05 EST
Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
> On Wed, 2010-12-08 at 10:02 -0500, Vivek Goyal wrote:
>
>> >but its kdump so its mostly broken by design anyway ;-)
>>
>> Kdump has its share of problems especially with the fact that
>> kernel/drivers find devices in bad state and are not hardened enough
>> to deal with that. But on bare metal what's the better way of capturing
>> kernel crash dump? Trying to do anything post crash in the kernel is
>> also not very reliable either.
>
> /me <3 RS-232
>
> I haven't found anything better than that...

True. But it can be a pain to operate RS-232 at production scale, or to
convince customers to hook up RS-232 just in case your released software
happens to crash.

> And poking at the RS-232 requires less of the kernel to be functional
> than booting into a new kernel (whose image might have been corrupted by
> the dying kernel, etc..)

For debugging a reproducible failure, RS-232 wins. For everything else
there is kdump. It sucks, but it is at least fixable.

And really the kdump kernel should be running a minimalistic hardware
config, so you only have to get the chunks of hardware you really care
about working.

As for corruption, the kdump kernel lives in an area of memory that we
never DMA to in the primary kernel, and we check a sha256 hash before we
start booting the kdump kernel. In general kdump fails safe. That is, if
it can't make things work it fails to boot and does nothing to your
system. Definitely not perfect, but if you don't have RS-232 it is the
best I have seen.

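
A rough sketch of that verify-then-boot idea, in Python for brevity. This
is not the actual kexec code: the function names, the segment list and the
boot callback are all made up for illustration. It only shows the shape of
the check, assuming a digest was recorded when the image was loaded.

```python
import hashlib


def digest_segments(segments):
    """Return the SHA-256 hex digest over all segments, in order."""
    h = hashlib.sha256()
    for seg in segments:
        h.update(seg)
    return h.hexdigest()


def verify_then_boot(segments, expected_digest, boot):
    """Call boot() only if the segments still match the recorded digest.

    'boot' is a hypothetical callback standing in for the jump into the
    kdump kernel. On a mismatch nothing is started at all.
    """
    if digest_segments(segments) != expected_digest:
        return False  # fail safe: image was modified, do nothing
    boot()
    return True
```

If the dying kernel scribbled on any byte of the loaded image, the digest
no longer matches and the boot step is simply skipped.
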
Eric