Re: BUG: soft lockup detected on Phenom with Debian 2.6.24-4
From: Laurent GUERBY
Date: Sat Apr 12 2008 - 03:28:38 EST
Hi,
FYI, with Peter's off-list help we found a way to make the ASUS M2A-VM with
the 1604 BIOS stable under my stress test: we just needed nmi_watchdog=1 in
the kernel boot options (no other boot option was necessary).
With nmi_watchdog=1 we see "APIC error" messages in kern.log, but
the machine stayed stable through 3 days of stress testing:
...
Apr 7 22:41:43 gcc04 kernel: APIC error on CPU2: 00(40)
Apr 7 22:41:43 gcc04 kernel: APIC error on CPU1: 00(40)
Apr 7 22:41:43 gcc04 kernel: APIC error on CPU3: 00(40)
Apr 7 22:41:43 gcc04 kernel: APIC error on CPU0: 00(40)
Apr 7 22:53:01 gcc04 kernel: APIC error on CPU3: 40(40)
Apr 7 22:53:01 gcc04 kernel: APIC error on CPU0: 40(40)
Apr 7 22:53:01 gcc04 kernel: APIC error on CPU1: 40(40)
...
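To keep an eye on how often these messages show up per CPU during a long stress run, something like the following could be used. It is only a minimal sketch: the function names are made up for illustration, the regular expression assumes the exact message format shown in the excerpt above, and the two hex fields are kept as plain integers without any interpretation.

```python
import re
from collections import Counter

# Matches the kernel message format seen in the excerpt above,
# e.g. "APIC error on CPU2: 00(40)". The format is taken from this log only.
APIC_RE = re.compile(r"APIC error on CPU(\d+): ([0-9a-fA-F]{2})\(([0-9a-fA-F]{2})\)")

# Sample lines copied from the kern.log excerpt in this message.
SAMPLE = [
    "Apr 7 22:41:43 gcc04 kernel: APIC error on CPU2: 00(40)",
    "Apr 7 22:41:43 gcc04 kernel: APIC error on CPU1: 00(40)",
    "Apr 7 22:41:43 gcc04 kernel: APIC error on CPU3: 00(40)",
    "Apr 7 22:41:43 gcc04 kernel: APIC error on CPU0: 00(40)",
    "Apr 7 22:53:01 gcc04 kernel: APIC error on CPU3: 40(40)",
    "Apr 7 22:53:01 gcc04 kernel: APIC error on CPU0: 40(40)",
    "Apr 7 22:53:01 gcc04 kernel: APIC error on CPU1: 40(40)",
]


def parse_apic_errors(lines):
    """Return a list of (cpu, first_value, second_value) tuples.

    Lines that do not contain an APIC error message are skipped.
    """
    events = []
    for line in lines:
        match = APIC_RE.search(line)
        if match:
            cpu = int(match.group(1))
            first = int(match.group(2), 16)
            second = int(match.group(3), 16)
            events.append((cpu, first, second))
    return events


def count_per_cpu(events):
    """Count how many APIC error messages each CPU logged."""
    return Counter(cpu for cpu, _, _ in events)


if __name__ == "__main__":
    for cpu, count in sorted(count_per_cpu(parse_apic_errors(SAMPLE)).items()):
        print(f"CPU{cpu}: {count} APIC error message(s)")
```

For a real run, the lines of kern.log would be passed in instead of SAMPLE.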
guerby@gcc04:~$ cat /proc/cmdline
root=/dev/sda1 ro nmi_watchdog=1
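The same check can be scripted, for example before starting a stress test. This is a small sketch under the assumption that boot options appear in /proc/cmdline as whitespace-separated tokens, as in the output above; kernel_param and read_cmdline are hypothetical helper names, and quoted values containing spaces are not handled.

```python
def kernel_param(cmdline, name):
    """Look up a boot option in a kernel command line string.

    Returns the value for "name=value" tokens, an empty string for a bare
    flag such as "ro", and None if the option is absent.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name:
            return value if sep else ""
    return None


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    value = kernel_param(read_cmdline(), "nmi_watchdog")
    if value is None:
        print("nmi_watchdog is not set on the kernel command line")
    else:
        print(f"nmi_watchdog={value}")
```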
We are now stress testing the 1705 BIOS version, which was released by
ASUS on 20080331, with and without nmi_watchdog=1. Then we'll go
back to testing the ASUS M3A32-MVP Deluxe/WiFi-AP with the newer 1002
BIOS, also released on 20080331.
Note: for MSR decoding, xxd should be used since hexdump doesn't work:
xxd -s 0xc0010015 -l 8 /dev/cpu/0/msr
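For those who prefer a single 64-bit value over a byte dump, the same read can be sketched in Python. This assumes the msr driver is loaded so that /dev/cpu/0/msr exists, that the script runs as root, and that the register number is used as the file offset, exactly as in the xxd command above. read_msr and decode_msr are illustrative names, not an existing API.

```python
import os
import struct

# Register number taken from the xxd command above.
MSR_ADDR = 0xC0010015


def decode_msr(raw):
    """Turn the 8 raw bytes read from the msr device into an integer.

    x86 is little-endian, so the bytes xxd prints left to right are the
    least significant byte first.
    """
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    return struct.unpack("<Q", raw)[0]


def read_msr(cpu, reg):
    """Read one MSR from /dev/cpu/<cpu>/msr (needs root and the msr driver)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The register number is the offset into the device file.
        return decode_msr(os.pread(fd, 8, reg))
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(f"MSR {MSR_ADDR:#010x} on CPU0 = {read_msr(0, MSR_ADDR):#018x}")
```

Keeping decode_msr separate means the byte-order handling can be checked without access to the device.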
So people having stability problems with a Phenom 9x00 under Linux should
try nmi_watchdog=1 as a boot option.
Sincerely,
Laurent