Re: Oops assist...

Keith Owens (kaos@ocs.com.au)
Thu, 06 May 1999 11:14:12 +1000


On Wed, 05 May 1999 13:42:11 -0400,
Horst von Brand <vonbrand@inf.utfsm.cl> wrote:
>Manfred Spraul <manfreds@colorfullife.com> said:
>> Alan Cox wrote:
>> > Even more ideal would be to dump the oops to somewhere non volatile over
>> > a reboot. I'm not sure there are good candidates for this on a PC.
>
>> What about the swap file?
>
>I don't trust the kernel enough to fiddle around with disks when it has
>just crashed...

Completely off the wall suggestion (i.e. I don't know enough to say if
it will work :). Using the kernel to write to disk is not safe after
an Oops. If the kernel structures are corrupt then you run a big risk
of corrupting the disk. Even if the kernel structures are valid, locks
and/or bh scheduling can easily stop all I/O.

What we need is a safe and guaranteed to execute method of writing to
disk. Without a trustworthy kernel, the only other option for disk I/O
is the BIOS. So is it possible to have some kernel code that switches
into real mode, uses the BIOS to dump information to a dedicated disk
partition then reboots? kernel/traps.c dumps its info then branches to
the dump code.

Obvious downsides - requires a dedicated disk partion below cylinder
1024, extra kernel code, a chunk of real code memory is dedicated to
the dump code and (the big one) all Oops would result in a reboot.
That last requirement is because the dump code has no way of telling if
the kernel is usable after an Oops so we have to assume worst case.
Obviously this entire procedure would be a compile time option or
module, not everybody wants to reboot after every Oops.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/