Re: NMI watchdog

From: John Sigler
Date: Fri Oct 12 2007 - 06:58:26 EST


Björn Steinbrink wrote:

John Sigler wrote:

I'm experiencing a full system lockup. I'm using an out-of-tree driver which I suspect is responsible. I'm trying to enable the NMI watchdog.

# cat /proc/version
Linux version 2.6.22.1-rt9 (gcc version 3.4.6) #1 PREEMPT RT Tue Oct 9 12:25:47 CEST 2007

# cat /proc/cmdline
ro root=/dev/hdc1 console=ttyS0,57600n8 console=tty0 panic=3 apic=debug nmi_watchdog=2

However, after boot, the NMI count does not change.

# cat /proc/interrupts ; sleep 10 ; cat /proc/interrupts

Try running some cpu hog in the background. The performance counters get
increased only when the CPU is actually doing something. On a mostly
idle system, it can take quite a while for the next NMI to show up.

You are right. In another shell, I ran while true; do : ; done

# cat /proc/interrupts ; sleep 10 ; cat /proc/interrupts
CPU0
0: 100 IO-APIC-edge timer
4: 82 IO-APIC-edge serial
8: 1 IO-APIC-edge rtc
9: 0 IO-APIC-fasteoi acpi
15: 13648 IO-APIC-edge ide1
16: 1303 IO-APIC-fasteoi eth0
17: 575 IO-APIC-fasteoi eth1
18: 575 IO-APIC-fasteoi eth2
19: 575 IO-APIC-fasteoi eth3
NMI: 2889
LOC: 115768
ERR: 0
MIS: 0

CPU0
0: 100 IO-APIC-edge timer
4: 82 IO-APIC-edge serial
8: 1 IO-APIC-edge rtc
9: 0 IO-APIC-fasteoi acpi
15: 13672 IO-APIC-edge ide1
16: 1310 IO-APIC-fasteoi eth0
17: 580 IO-APIC-fasteoi eth1
18: 580 IO-APIC-fasteoi eth2
19: 580 IO-APIC-fasteoi eth3
NMI: 2899
LOC: 116770
ERR: 0
MIS: 0

The performance counter appears to be configured to fire when the event count for CPU_CLK_UNHALTED reaches 2,400,000,000 (I have a 2.4 GHz CPU) i.e. one NMI per second when the CPU is 100% busy. Is that correct?

On a related note, I have a Pentium 3 which counts CPU_CLK_UNHALTED cycles even when the CPU is halted. I was told this is a bug, but it actually sounds like a a nice feature!

Is there really no way to have the event counter increment with every tick (even when the CPU is halted) on a P4?

Regards.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/