Re: dump device, reboot on kernel panic

Stephen C. Tweedie (sct@redhat.com)
Mon, 9 Nov 1998 23:18:13 GMT


Hi,

On Fri, 06 Nov 1998 19:28:57 -0800, John Daley
<johnd@virtual-impact.com> said:

> Our company supports a bunch of cheapo pentium boxes scattered
> about the country running linux. They are maintained remotely
> through either ppp or isdn connections. This works great except for
> the rare kernel panic that leaves the machine in a state where we
> can't dial into it. ...

> I guess hardware watchdog boards would be an option too, but haven't
> looked into that.

You should do: sounds like it would be a useful addition.

> Some ideas:

> In order to make a linux box 100% maintainable remotely (until a hard
> failure occurs of course), what do y'all think about these ideas
> (maybe they are already available?):

> (1) Have a 'reboot on panic' capability. Optionally booting into
> another partition. Switching between the modes through /proc or so.

Already done:

echo N > /proc/sys/kernel/panic

to set the timeout (in seconds) from panic until the reboot.
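
As a rough sketch of the above: the helper name set_panic_timeout, the
PANIC_FILE override and the value 30 are illustrative assumptions, not
anything from the kernel itself. Writing to the real /proc file needs root.

```shell
#!/bin/sh
# Hypothetical helper: set the panic-to-reboot timeout in seconds.
# PANIC_FILE can be overridden so the function can be tried on a scratch
# file; by default it is the /proc entry named above.
PANIC_FILE="${PANIC_FILE:-/proc/sys/kernel/panic}"

set_panic_timeout() {
    # Accept only a non-negative integer number of seconds.
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: set_panic_timeout SECONDS" >&2
            return 1
            ;;
    esac
    echo "$1" > "$PANIC_FILE"
}

# Example (as root): reboot 30 seconds after a panic.
# A value of 0 means "do not reboot on panic".
#   set_panic_timeout 30
#   cat /proc/sys/kernel/panic
```

The setting does not survive a reboot, so on a remotely maintained box
the call would normally go into one of the boot scripts.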

> (2) Have the ability to write a system image to a pre-defined dump
> partition when the kernel panics.

Yes, would be nice.

> (3) Journaled filesystem

I'm working on it: hopefully we'll have a prototype as a Christmas
present. :)

Finally, investigate using a serial console. That is built into 2.1
(read Documentation/serial-console.txt), but you can achieve similar
functionality on 2.0 by #defining CONFIG_SERIAL_ECHO and editing the
serial port address in linux/drivers/char/console.c. Then make the
/dev/console device point to /dev/ttyS0 and you are all set. I do this
on all of my boxes.
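
One possible way to do the /dev/console step is with a symlink. This is
only a sketch: the helper name point_console_at, the DEV_DIR override and
the console.orig backup name are assumptions made for illustration, and
replacing the node with mknod would be another way to get the same effect.

```shell
#!/bin/sh
# Hypothetical helper: make DEV_DIR/console a symlink to another device
# in the same directory (e.g. ttyS0). DEV_DIR defaults to /dev and can
# be overridden to try the function out in a scratch directory.
DEV_DIR="${DEV_DIR:-/dev}"

point_console_at() {
    target="$1"
    if [ ! -e "$DEV_DIR/$target" ]; then
        echo "no such device: $DEV_DIR/$target" >&2
        return 1
    fi
    # Keep the old console entry around (assumed backup name) so the
    # change is easy to undo.
    if [ -e "$DEV_DIR/console" ] || [ -L "$DEV_DIR/console" ]; then
        mv "$DEV_DIR/console" "$DEV_DIR/console.orig"
    fi
    ln -s "$target" "$DEV_DIR/console"
}

# Example (as root):
#   point_console_at ttyS0
```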

> And now for a real kernel problem....:

> I have a very old kernel (2.0.18) running redhat 4.0 that crashes
> sometimes. First off, can I upgrade to kernel version 2.0.35 without
> updating libc or anything?

Yes.

> I can't reproduce the problem (see background above). The crash is at
> get_empty_inode+68/324.

Lots of stuff in there has been fixed; in particular at least a couple
of nasty races involving clearing inodes have been fixed since 2.0.18.

--Stephen
