NMI errors with 2.6.13 (and 2.6.12.x)

From: Dimitri Puzin
Date: Fri Sep 02 2005 - 12:53:35 EST


Hello,

I've got a Tyan S5112 (Intel E7210 Chipset) and 2x256 MB ECC RAM. The
CPU is a P-IV 3.0GHz HT. I run the current vanilla kernel on
Debian/Sarge. The config is attached. The kernel is booted with
nmi_watchdog=1. That box logs random oopses [0] to the syslog but
doesn't freeze. I've already tried different configs with and without
ACPI, SMP, HT etc, but no useful results. When ran without nmi_watchdog
the /proc/interrupts gets filled with ERR interrupts. The call trace is
every time different. The same appeared with 2.6.12.x series when I was
figuring what's running bad. The current release shows the same. Could
anyone give some advice on that?

Kind regards
-Dimitri aka Tristan-777

[0]: snippet from syslog, with timestamp and hostname etc stripped
NMI: IOCK error (debug interrupt?)
Modules linked in: as_iosched ipv6 nfsd lockd nfs_acl sunrpc lp thermal
fan button processor ac battery tun crc32 pppoe pppox af_packet
ppp_generic slhc ipt_MASQUERADE ipt_REDIRECT ipt_REJECT ipt_ULOG
ipt_state iptable_filter parport_pc parport 8250_pnp 8250 serial_core
evdev floppy pcspkr prism54 firmware_class sg st sr_mod cdrom sym53c8xx
scsi_transport_spi scsi_mod e1000 i2c_i801 ehci_hcd uhci_hcd usbcore
intel_agp agpgart xfs exportfs dm_mod ip_nat_ftp iptable_nat ip_tables
ip_conntrack_ftp ip_conntrack non_fatal eeprom lm85 w83627hf i2c_sensor
i2c_isa i2c_core raid5 xor md_mod rtc ext3 jbd mbcache ide_disk piix
pdc202xx_new ide_core unix
CPU: 0
EIP: 0060:[mwait_idle+51/80] Not tainted VLI
EFLAGS: 00000246 (2.6.13)
EIP is at mwait_idle+0x33/0x50
eax: 00000000 ebx: c0308008 ecx: 00000000 edx: 00000000
esi: c0308000 edi: deb63000 ebp: c0331300 esp: c0309fb0
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c0308000 task=c028dbc0)
Stack: ffffe000 deb63074 e0c29b75 c14042e0 c14042e0 ffffe000 c0308000
c0331380
c0331300 c0100d80 00020800 0009e700 c0337120 00385007 c030a907
00000016
c030a360 00000000 c0338c80 c010020f
Call Trace:
[pg0+546069365/1070179328] acpi_processor_idle+0x106/0x2ae [processor]
[cpu_idle+112/128] cpu_idle+0x70/0x80
[start_kernel+375/448] start_kernel+0x177/0x1c0
[unknown_bootoption+0/448] unknown_bootoption+0x0/0x1c0
Code: 21 e2 8b 42 08 a8 08 75 36 f0 0f ba 6a 08 10 31 c9 89 d6 8d 5a 08
89 f6 89 d8 89 ca 0f 01 c8 8b 46 08 a8 08 75 0c 89 c8 0f 01 c9 <8b> 46
08 a8 08 74 e6 b8 00 e0 ff ff 21 e0 f0 0f ba 70 08 10 5b

Attachment: config-2.6.13.gz
Description: GNU Zip compressed data