NMI Watchdog detected LOCKUP
From: Jesper Juhl
Date: Thu Dec 28 2006 - 04:11:40 EST
Hi,
One of my NFS servers just blew up on me. I found the following in the log - hope it is useful :
NMI Watchdog detected LOCKUP on CPU 1
CPU 1
Modules linked in: sg eeprom i2c_i801
Pid: 284, comm: scsi_eh_2 Not tainted 2.6.18.1 #1
RIP: 0010:[<ffffffff80375e6a>] [<ffffffff80375e6a>] __delay+0xa/0x20
RSP: 0000:ffff81022fbf5d80 EFLAGS: 00000097
RAX: 000000000020bb08 RBX: 000000000000d6db RCX: 0000000057489868
RDX: 0000000000192501 RSI: ffff8100cff93528 RDI: 000000000030d2ac
RBP: ffff81022fbf5d80 R08: ffff81022fbf4000 R09: 0000000000000000
R10: ffff8100cff93528 R11: 00000000000000d8 R12: ffff81022f449ce8
R13: 0000000000000001 R14: 0000000000000004 R15: ffff81022f449878
FS: 0000000000000000(0000) GS:ffff8101fef36640(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000f7f59000 CR3: 000000022f590000 CR4: 00000000000006e0
Process scsi_eh_2 (pid: 284, threadinfo ffff81022fbf4000, task ffff81022fa32180)
Stack: ffff81022fbf5d90 ffffffff80375eaa ffff81022fbf5dc0 ffffffff803ee1bc
ffff8100cff93528 ffff81022f449ce8 ffff81022f449800 ffff81022f449d40
ffff81022fbf5e00 ffffffff803f175f ffff81022fbf5de0 ffffffff80371529
Call Trace:
[<ffffffff80375eaa>] __const_udelay+0x2a/0x30
[<ffffffff803ee1bc>] ips_send_wait+0xbc/0xe0
[<ffffffff803f175f>] __ips_eh_reset+0x11f/0x3f0
[<ffffffff803f1a53>] ips_eh_reset+0x23/0x40
[<ffffffff803e5034>] scsi_try_host_reset+0x34/0xb0
[<ffffffff803e5eb6>] scsi_error_handler+0x476/0x6d0
[<ffffffff80241bfb>] kthread+0xdb/0x120
[<ffffffff8020ab1c>] child_rip+0xa/0x12
Code: 0f 31 29 c8 48 39 f8 72 f5 c9 c3 66 66 66 90 66 66 66 90 66
console shuts up ...
NMI Watchdog detected LOCKUP on CPU 0
CPU 0
Modules linked in: sg eeprom i2c_i801
Pid: 15, comm: kblockd/0 Not tainted 2.6.18.1 #1
RIP: 0010:[<ffffffff804aca3b>] [<ffffffff804aca3b>] .text.lock.spinlock+0x2/0x97
RSP: 0000:ffff81022fbbfd70 EFLAGS: 00000086
RAX: ffff81022fa22228 RBX: ffff8100a46be4a8 RCX: 0000000000000000
RDX: ffff8100a46be7d8 RSI: ffff8100a46be4a8 RDI: ffff81022f449850
RBP: ffff81022fbbfd70 R08: ffff81022fbbe000 R09: 0000000000000003
R10: 0000000000000002 R11: 0000000000000000 R12: ffff81022fae6800
R13: ffff81022f449800 R14: ffff81022fa22048 R15: ffff8101f304e980
FS: 0000000000000000(0000) GS:ffffffff80683000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000000805bc2c CR3: 000000022b911000 CR4: 00000000000006e0
Process kblockd/0 (pid: 15, threadinfo ffff81022fbbe000, task ffff81022faeaf00)
Stack: ffff81022fbbfdc0 ffffffff803e791f ffff81022fae6848 ffff81022fae6848
ffff81022fae68e8 ffff81022fa22048 ffff81022fa22170 ffff8101fef362c0
ffff81022fa22048 0000000000000282 ffff81022fbbfde0 ffffffff803632f5
Call Trace:
Inexact backtrace:
[<ffffffff803e791f>] scsi_request_fn+0x19f/0x350
[<ffffffff803632f5>] __generic_unplug_device+0x25/0x30
[<ffffffff80363320>] generic_unplug_device+0x20/0x40
[<ffffffff8036336d>] blk_unplug_work+0xd/0x10
[<ffffffff8023e415>] run_workqueue+0xb5/0x110
[<ffffffff80363360>] blk_unplug_work+0x0/0x10
[<ffffffff8023e5a1>] worker_thread+0x131/0x170
[<ffffffff80227660>] default_wake_function+0x0/0x10
[<ffffffff80227660>] default_wake_function+0x0/0x10
[<ffffffff8023e470>] worker_thread+0x0/0x170
[<ffffffff80241bfb>] kthread+0xdb/0x120
[<ffffffff8020ab1c>] child_rip+0xa/0x12
[<ffffffff80241b20>] kthread+0x0/0x120
[<ffffffff8020ab12>] child_rip+0x0/0x12
Code: 83 3f 00 7e f9 e9 af fc ff ff f3 90 83 3f 00 7e f9 e9 d6 fc
console shuts up ...
The server is running 2.6.18.1
# uname -a
Linux st1.surf-town.net 2.6.18.1 #1 SMP Wed Oct 18 12:45:37 CEST 2006 x86_64 GNU/Linux
The kernel is 64bit but userspace is 32bit.
# scripts/ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.
Linux st1.surf-town.net 2.6.18.1 #1 SMP Wed Oct 18 12:45:37 CEST 2006 x86_64 GNU/Linux
Gnu C 3.3.5
Gnu make 3.80
binutils 2.15
util-linux 2.12p
mount 2.12p
module-init-tools 3.2-pre1
e2fsprogs 1.37
xfsprogs 2.6.28
quota-tools 3.12.
nfs-utils 1.0.6
Linux C Library 2.3.2
Dynamic linker (ldd) 2.3.2
Procps 3.2.1
Net-tools 1.60
Console-tools 0.2.3
Sh-utils 5.2.1
udev 056
Modules Loaded sg eeprom i2c_i801
# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.20GHz
stepping : 3
cpu MHz : 3200.193
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips : 6413.18
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.20GHz
stepping : 3
cpu MHz : 3200.193
cache size : 2048 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips : 6400.68
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:
# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 15 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Processor ANSI SCSI revision: 02
Host: scsi0 Channel: 02 Id: 08 Lun: 00
Vendor: IBM Model: 39M6750a S320 0 Rev: 1
Type: Processor ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 01 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 02 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 03 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 15 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Processor ANSI SCSI revision: 02
Host: scsi1 Channel: 01 Id: 15 Lun: 00
Vendor: IBM Model: EXP400 S320 Rev: D110
Type: Processor ANSI SCSI revision: 03
Host: scsi2 Channel: 00 Id: 00 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 01 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 02 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 03 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 04 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 05 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 06 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 15 Lun: 00
Vendor: IBM Model: SERVERAID Rev: 1.00
Type: Processor ANSI SCSI revision: 02
Host: scsi2 Channel: 01 Id: 15 Lun: 00
Vendor: IBM Model: EXP400 S320 Rev: D110
Type: Processor ANSI SCSI revision: 03
--
Kind regards,
Jesper Juhl <jesper.juhl@xxxxxxxxx>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/