Re: NMI watchdog question

From: Robert Hancock
Date: Sun Mar 06 2005 - 10:16:58 EST


Pallai Roland wrote:
Hi,
I'm playing with the NMI watchdog (nmi_watchdog=1) on a reproducible
hard lockup (no keyboard, etc.), but it doesn't seem to work and I
can't understand why. Please explain the possible causes. I
believe it should work in this situation.

environment:
P4C800 motherboard, P4-2.4 cpu (APIC 2.0 on)
Promise 20378 SATA controller on the motherboard (sata_promise driver)
Maxtor DiamondMax Plus 9 200G SATA disk
(and an empty PCI expander plus some other hardware under test,
which doesn't matter for this experiment)

mainline kernel 2.6.11
serial and VGA console, root on NFS


steps to the lockup:
1. boot the machine with the SATA drive on the Promise controller
2. dd if=/dev/sda of=/dev/null bs=4k
3. unplug the power from the drive
4. wait about 2 seconds
5. plug the power back in

dd is stuck in 'D' state here for 10-15 seconds, and then the kernel says:
ata1: command timeout

and voila, the box is dead, but without any message from the NMI
watchdog :(

The NMI watchdog only triggers if something is blocking interrupts from getting through. If timer interrupts are still being handled, it won't activate. You can try Alt-SysRq-T to get a traceback of where the current process is stuck.
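
Before relying on the watchdog, it may be worth confirming that its NMIs are arriving at all while the machine is still healthy: with nmi_watchdog=1 the NMI line in /proc/interrupts should keep increasing. The following is only a minimal Python sketch of that check, assuming the usual /proc/interrupts layout (a line starting with "NMI:" followed by one count per CPU). The function names are hypothetical and not part of any kernel tooling.

```python
def parse_nmi_counts(interrupts_text):
    """Return the per-CPU NMI counts found in /proc/interrupts-style text.

    Assumes a line whose first field is "NMI:" followed by one integer
    per CPU; some kernels append a text description, which is skipped.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for token in fields[1:]:
                if not token.isdigit():
                    break  # reached the trailing description, if any
                counts.append(int(token))
            return counts
    return []  # no NMI line present


def nmi_watchdog_ticking(before, after):
    """Return True if any CPU's NMI count grew between two samples."""
    old = parse_nmi_counts(before)
    new = parse_nmi_counts(after)
    return any(b > a for a, b in zip(old, new))


def read_interrupts(path="/proc/interrupts"):
    """Read the interrupt table as text (Linux only)."""
    with open(path) as f:
        return f.read()
```

To use it, call read_interrupts() twice a few seconds apart and pass both results to nmi_watchdog_ticking(). If it returns False, the watchdog NMIs are not being delivered, which would be a separate problem from the lockup itself.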