[BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

From: Masami Hiramatsu
Date: Fri Jul 10 2015 - 07:33:10 EST


Hi,

I've hit a kernel panic in the Locking API testsuite on kvm-qemu.

The top commit is abf9b5f800eb13e53543ff284177efb538dc68fd.
It seems there is a bug in the paravirt qspinlock implementation.

Creating a configuration that reproduces this bug is very simple:
make allmodconfig && make localmodconfig.

Here is the kernel message, which I recovered from the kernel log buffer using gdb.

-----
Locking API testsuite:
----------------------------------------------------------------------------
                                 | spin |wlock |rlock |mutex | wsem | rsem |
  --------------------------------------------------------------------------
                     A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-B-C-C-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                    double unlock:
------------[ cut here ]------------
kernel BUG at /home/mhiramat/ksrc/linux-3/kernel/locking/qspinlock_paravirt.h:137!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc1-01639-gabf9b5f #2
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
task: ffffffff81e2b700 ti: ffffffff81e00000 task.ti: ffffffff81e00000
RIP: 0010:[<ffffffff811152bc>] [<ffffffff811152bc>] __pv_queued_spin_unlock+0xd0/0x110
RSP: 0000:ffffffff81e07e60 EFLAGS: 00010202
RAX: 0000000000000010 RBX: ffffffff828e8c50 RCX: 0000000000000100
RDX: ffff88007fe8efc0 RSI: 00000000000000ff RDI: ffffffff828e8c50
RBP: ffffffff81e07e60 R08: ffff88007fe8eec0 R09: 0000000000000100
R10: 0000000000000000 R11: 2020202020202020 R12: 0000000000000000
R13: 0000000000000001 R14: ffffffff814603b1 R15: 0000ffffffff82c8
FS: 0000000000000000(0000) GS:ffff88006ce00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000ffffffff CR3: 0000000001e22000 CR4: 00000000000006b0
Stack:
ffffffff81e07ec8 ffffffff81114a59 2020202020202020 0000000000000000
0000000000000000 0000000000000000 ffffffff828e8c50 ffffffff81c8cf83
00000000000000b8 00000000000000b8 ffffffff81117133 00000000000000b8

Call Trace:
[<ffffffff81114a59>] __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e
[<ffffffff81117133>] ? do_raw_spin_unlock+0xfa/0x10c
[<ffffffff817cd3f7>] _raw_spin_unlock+0x44/0x64
[<ffffffff814603ee>] double_unlock_spin+0x3d/0x46
[<ffffffff817c1e42>] dotest+0x85/0x16e
[<ffffffff81471431>] locking_selftest+0x67d/0x2a80
[<ffffffff82c9062a>] start_kernel+0x5bc/0x874
[<ffffffff82c8fc1d>] ? set_init_arg+0xb6/0xb6
[<ffffffff82c8f120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff82c8f625>] x86_64_start_reservations+0x46/0x4f
[<ffffffff82c8f84c>] x86_64_start_kernel+0x21e/0x234
Code: 44 fe ca 75 62 eb 2d 48 ff c1 48 ff 05 de 73 c7 02 48 8d 14 01 48 21 f2 48 c1 e2 04 4c 01 c2 4c 39 c9 72 af 48 ff 05 d4 73 c7 02 <0f> 0b 48 ff 05 d3 73 c7 02 48 83 3d ab e8 d7 00 00 48 63 78 40
RIP [<ffffffff811152bc>] __pv_queued_spin_unlock+0xd0/0x110
RSP <ffffffff81e07e60>
---[ end trace ffa8b6c1f29ba3a3 ]---
Kernel panic - not syncing: Fatal exception
---[ end Kernel panic - not syncing: Fatal exception

And here is the backtrace from gdb.

(gdb) bt
#0 __delay (loops=1) at /home/mhiramat/ksrc/linux-3/arch/x86/lib/delay.c:108
#1 0xffffffff8144f02a in __const_udelay (xloops=<optimized out>)
at /home/mhiramat/ksrc/linux-3/arch/x86/lib/delay.c:122
#2 0xffffffff817b6090 in panic (fmt=<optimized out>)
at /home/mhiramat/ksrc/linux-3/kernel/panic.c:201
#3 0xffffffff8101fecd in oops_end (flags=514,
regs=0xffffffff81e07db8 <init_thread_union+32184>, signr=11)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/dumpstack.c:249
#4 0xffffffff8102039f in die (str=0xffffffff81c6b82e "invalid opcode",
regs=0xffffffff81e07db8 <init_thread_union+32184>, err=0)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/dumpstack.c:316
#5 0xffffffff8101b33f in do_trap_no_signal (error_code=<optimized out>,
regs=<optimized out>, str=<optimized out>, trapnr=<optimized out>,
tsk=<optimized out>)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/traps.c:204
#6 do_trap (trapnr=<optimized out>, signr=<optimized out>,
str=0x0 <irq_stack_union>, regs=0x0 <irq_stack_union>, error_code=1,
info=<optimized out>)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/traps.c:250
#7 0xffffffff8101b8fe in do_error_trap (
regs=0xffffffff81e07db8 <init_thread_union+32184>, error_code=0,
str=0xffffffff81c6b82e "invalid opcode", trapnr=6, signr=4)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/traps.c:289
#8 0xffffffff8101bf7d in do_invalid_op (regs=<optimized out>,
error_code=<optimized out>)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/traps.c:302
#9 0xffffffff817d004e in invalid_op ()
at /home/mhiramat/ksrc/linux-3/arch/x86/entry/entry_64.S:826
#10 0x0000ffffffff82c8 in ?? ()
#11 0xffffffff814603b1 in bad_unlock_order_spin ()
at /home/mhiramat/ksrc/linux-3/lib/locking-selftest.c:532
#12 0x0000000000000001 in irq_stack_union ()
#13 0x0000000000000000 in ?? ()
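
For context, the kernel call trace above goes through double_unlock_spin,
a selftest that unlocks a spinlock twice on purpose. Below is a minimal
user-space sketch of that pattern in C. It assumes a hypothetical toy lock
(struct toy_lock, toy_lock_acquire, toy_lock_release and double_unlock_demo
are all made-up names) and is not the kernel's qspinlock code. It only
illustrates how an unlock path could report that the lock was not held
rather than treating that case as fatal.

```c
/* Toy user-space illustration only; this is NOT the kernel's qspinlock code. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock: 0 means unlocked, 1 means locked. */
struct toy_lock {
	atomic_int locked;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	int expected = 0;

	/* Spin until we manage to swap 0 -> 1. */
	while (!atomic_compare_exchange_weak(&l->locked, &expected, 1))
		expected = 0;
}

/* Returns true if the lock was held, false if the unlock was bogus. */
static bool toy_lock_release(struct toy_lock *l)
{
	return atomic_exchange(&l->locked, 0) == 1;
}

/* Same shape as the failing test: lock once, then unlock twice. */
static bool double_unlock_demo(void)
{
	struct toy_lock a;
	bool first;

	atomic_init(&a.locked, 0);

	toy_lock_acquire(&a);
	first = toy_lock_release(&a);	/* legitimate unlock */
	assert(first);

	/* Second unlock: the lock is no longer held at this point. */
	return toy_lock_release(&a);
}
```

Calling double_unlock_demo() from a main() returns false, because the second
release finds the lock already free. Whether the paravirt unlock path should
warn or BUG in that situation is a separate question for the real code.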

--
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: masami.hiramatsu.pt@xxxxxxxxxxx