latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()

From: Vegard Nossum
Date: Tue Aug 19 2008 - 15:51:55 EST


Hi,

With latest -git (1fca25427482387689fa27594c992a961d98768f), I got the
following warning, followed by a soft lockup, when reading from
/dev/cpu/*/* while hot-unplugging cpu1.
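
A minimal sketch of how the scenario might be reproduced. The exact commands
are not recorded in this report (the trace only shows that the reader was
`cat`), so everything here is an assumption: the helper names
`set_cpu_online`, `read_cpu_nodes` and `stress` are hypothetical, and the
sketch assumes the usual sysfs hotplug file
/sys/devices/system/cpu/cpuN/online and the msr/cpuid nodes under /dev/cpu.
It needs root and may hang the machine, so it only runs when passed `--run`.

```python
#!/usr/bin/env python3
"""Hypothetical reproduction sketch: read /dev/cpu/*/* while toggling a CPU."""
import glob
import os
import sys
import threading


def set_cpu_online(cpu, online, sysfs_root="/sys/devices/system/cpu"):
    """Write 1 or 0 to the CPU's sysfs 'online' file (requires root)."""
    path = os.path.join(sysfs_root, "cpu%d" % cpu, "online")
    with open(path, "w") as f:
        f.write("1" if online else "0")


def read_cpu_nodes(dev_root="/dev/cpu", chunk=4096):
    """Read one chunk from every <dev_root>/<n>/<node>.

    Returns a dict mapping each path to the number of bytes read,
    or -1 if the open or read failed.
    """
    results = {}
    for path in sorted(glob.glob(os.path.join(dev_root, "*", "*"))):
        try:
            # Unbuffered, so this is a single read() of `chunk` bytes.
            with open(path, "rb", buffering=0) as f:
                results[path] = len(f.read(chunk))
        except OSError:
            # Reads can fail, e.g. if the CPU is offline or the
            # register does not exist.
            results[path] = -1
    return results


def stress(cpu=1, rounds=100):
    """Toggle one CPU offline/online while another thread keeps reading."""
    stop = threading.Event()

    def reader():
        while not stop.is_set():
            read_cpu_nodes()

    t = threading.Thread(target=reader)
    t.start()
    try:
        for _ in range(rounds):
            set_cpu_online(cpu, False)
            set_cpu_online(cpu, True)
    finally:
        stop.set()
        t.join()


if __name__ == "__main__" and "--run" in sys.argv:
    stress()
```

The directory roots are parameters only so that the helpers can be pointed at
a scratch directory for a dry run; on a real system the defaults apply.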

------------[ cut here ]------------
WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()
Pid: 3881, comm: cat Not tainted 2.6.27-rc3-00464-g1fca254 #12
[<c013591f>] warn_on_slowpath+0x4f/0x80
[<c010a300>] ? native_sched_clock+0x80/0x110
[<c010a335>] ? native_sched_clock+0xb5/0x110
[<c015ae5a>] ? __lock_acquire+0x27a/0xa00
[<c015635b>] ? trace_hardirqs_off+0xb/0x10
[<c010a335>] ? native_sched_clock+0xb5/0x110
[<c01563bd>] ? put_lock_stats+0xd/0x30
[<c0118a43>] send_IPI_mask_bitmask+0xc3/0xe0
[<c01017c8>] send_IPI_mask+0x8/0x10
[<c0118307>] native_send_call_func_single_ipi+0x27/0x30
[<c0160a2b>] generic_exec_single+0x7b/0x80
[<c0160adf>] smp_call_function_single+0x5f/0x110
[<c037a440>] ? __rdmsr_safe_on_cpu+0x0/0x60
[<c037a440>] ? __rdmsr_safe_on_cpu+0x0/0x60
[<c037a597>] _rdmsr_on_cpu+0x27/0x60
[<c037a5ea>] rdmsr_safe_on_cpu+0x1a/0x20
[<c011733e>] msr_read+0x6e/0xa0
[<c01a87b4>] vfs_read+0x94/0x130
[<c01172d0>] ? msr_read+0x0/0xa0
[<c01a8b5d>] sys_read+0x3d/0x70
[<c01040db>] sysenter_do_call+0x12/0x3f
=======================
---[ end trace fe4338948cb73be2 ]---
BUG: soft lockup - CPU#0 stuck for 61s! [cat:3881]
irq event stamp: 14632440
hardirqs last enabled at (14632439): [<c015968b>] trace_hardirqs_on+0xb/0x10
hardirqs last disabled at (14632440): [<c015635b>] trace_hardirqs_off+0xb/0x10
softirqs last enabled at (14632434): [<c013a4d1>] __do_softirq+0xe1/0x100
softirqs last disabled at (14632427): [<c013a595>] do_softirq+0xa5/0xb0
Pid: 3881, comm: cat Tainted: G W (2.6.27-rc3-00464-g1fca254 #12)
EIP: 0060:[<c0160952>] EFLAGS: 00200202 CPU: 0
EIP is at csd_flag_wait+0x12/0x20
EAX: f5f31ef0 EBX: c215dc60 ECX: ffffb300 EDX: 000008fa
ESI: 00200292 EDI: c215dc68 EBP: f5f31ec0 ESP: f5f31ec0
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: 087d0a5c CR3: 33c36000 CR4: 000006d0
DR0: c0ebd43c DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
[<c0160a15>] generic_exec_single+0x65/0x80
[<c0160adf>] smp_call_function_single+0x5f/0x110
[<c037a440>] ? __rdmsr_safe_on_cpu+0x0/0x60
[<c037a440>] ? __rdmsr_safe_on_cpu+0x0/0x60
[<c037a597>] _rdmsr_on_cpu+0x27/0x60
[<c037a5ea>] rdmsr_safe_on_cpu+0x1a/0x20
[<c011733e>] msr_read+0x6e/0xa0
[<c01a87b4>] vfs_read+0x94/0x130
[<c01172d0>] ? msr_read+0x0/0xa0
[<c01a8b5d>] sys_read+0x3d/0x70
[<c01040db>] sysenter_do_call+0x12/0x3f
=======================

At least SSH is unusable after this, but I guess SysRq and the like
would still work, since the "CPU stuck" message still showed up after
the apparent freeze.


Vegard

PS: This is probably not a regression.

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036