On 12/08/06, Jesse Brandeburg <jesse.brandeburg@xxxxxxxxx> wrote:
One other thing, putting a watchdog card in the box to automatically
> On 8/10/06, Om N. <xhandle@xxxxxxxxx> wrote:
> > (I do not have a remote power on/off switch. The driver panics so
> > often that somebody has to babysit the machine to switch it off and
> > on. We are in different time zones and things are not moving forward at all)
>
> two (or three) things I've done to help this, when I'm working remotely
>
> add panic=30 to your kernel options in grub (or echo 30 >
> /proc/sys/kernel/panic) to reboot the system automatically on panic.
>
> set grub to automatically boot the safe kernel by default, and when
> making a new kernel, set grub to boot it only once. Say your default
> is entry 0 and the new kernel is entry 1 in grub:
>
> echo 'savedefault --default=1 --once' | grub --batch
>
> (optional) set up netconsole so that you can see the kernel messages on an oops.
>
> finding out about all these was incredibly hard and obtuse :-) So
> hope this helps.
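
For the netconsole step, here is a minimal sketch of how the module
parameter could be assembled. The helper name netconsole_param and all
addresses are made up for illustration; the ports default to the usual
netconsole values (6665 source, 6666 target), and leaving the target MAC
empty should fall back to broadcast. Check the netconsole documentation
for your kernel before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: build the value for the netconsole= module parameter.
# Parameter layout: <src-port>@<src-ip>/<dev>,<tgt-port>@<tgt-ip>/<tgt-mac>
#   $1 = source IP (the crashing box)   $2 = network device
#   $3 = target IP (the log receiver)   $4 = target MAC (optional)
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' \
        "${SRC_PORT:-6665}" "$1" "$2" \
        "${TGT_PORT:-6666}" "$3" "${4:-}"
}

# Example use on the crashing box (placeholder addresses, needs root):
#   modprobe netconsole netconsole="$(netconsole_param 192.168.0.10 eth0 192.168.0.20)"
```

On the receiving machine, any UDP listener on the target port will show
the messages.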

If you have remote access to the keyboard and display, for example via
a network-enabled KVM switch, then magic SysRq can also be very helpful
for remotely syncing filesystems, remounting them read-only, and doing
emergency reboots. It can also be used to get stack dumps and the like.
Read Documentation/sysrq.txt for all the details.
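
If you can still get a root shell on the box (over ssh, say), the same
SysRq actions can be triggered by writing the key letter to
/proc/sysrq-trigger. A rough sketch follows; the function name
sysrq_send and the SYSRQ_TRIGGER and SYSRQ_DELAY variables are my own
invention so the sequence can be tried against a plain file first.

```shell
#!/bin/sh
# Hypothetical helper: send a sequence of SysRq keys through procfs.
# SYSRQ_TRIGGER can be pointed at an ordinary file for a dry run.
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

sysrq_send() {
    for key in "$@"; do
        # Each write triggers one action; stop if a write fails.
        printf '%s' "$key" > "$SYSRQ_TRIGGER" || return 1
        # Give the kernel a moment between actions (default 1 second).
        sleep "${SYSRQ_DELAY:-1}"
    done
}

# Emergency sequence (needs root, reboots the machine immediately):
#   s = sync, u = remount filesystems read-only, b = reboot
#   sysrq_send s u b
```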

I'm not sure, but I think there was a patch floating around at one
point that would let you trigger SysRq remotely over the network
without even logging into the box or having remote keyboard access.
That could be very useful in your case, although it is of course
extremely insecure.