3.16.1 - general protection fault

From: Marcin Gibula
Date: Wed Sep 24 2014 - 12:44:18 EST


Hi,

I've been playing with 3.16 kernel on my test machine as a KVM
hypervisor and encountered the following crash twice (almost identical
backtraces).

Sep 24 09:39:31 dev4 kernel: general protection fault: 0000 [#1] SMP
Sep 24 09:39:31 dev4 kernel: Modules linked in: ip6table_filter
ip6_tables iptable_filter ebt_arp act_police cls_u32 sch_ingress
cls_fw sch_sfq sch_htb ebt_dnat ebt_ip ebtable_nat ebtables
iptable_raw xt_CT x86_pkg_temp_thermal mpt2sas raid_cla
ss scsi_transport_sas i2c_i801 igb i2c_algo_bit megaraid_sas
Sep 24 09:39:31 dev4 kernel: CPU: 7 PID: 1892 Comm: kworker/7:1 Not
tainted 3.16.1-gentoo #2
Sep 24 09:39:31 dev4 kernel: Hardware name: Supermicro
X9DRW-7TPF+/X9DRW-7TPF+, BIOS 3.0 07/24/2013
Sep 24 09:39:31 dev4 kernel: Workqueue: rpciod rpc_async_schedule
Sep 24 09:39:31 dev4 kernel: task: ffff88102865de80 ti:
ffff88102456c000 task.ti: ffff88102456c000
Sep 24 09:39:31 dev4 kernel: RIP: 0010:[<ffffffff8113421c>]
[<ffffffff8113421c>] __kmalloc_node_track_caller+0xac/0x100
Sep 24 09:39:31 dev4 kernel: RSP: 0018:ffff88102456f958 EFLAGS: 00010246
Sep 24 09:39:31 dev4 kernel: RAX: 0000000000000000 RBX:
ffff8804fa153600 RCX: 000000000040471f
Sep 24 09:39:31 dev4 kernel: RDX: 000000000040471e RSI:
0000000000000000 RDI: 00000000000147c0
Sep 24 09:39:31 dev4 kernel: RBP: ffff88102456f988 R08:
ffff88103fcf47c0 R09: ffffffff8151d1c9
Sep 24 09:39:31 dev4 kernel: R10: ffff88102456fc70 R11:
0000000000000000 R12: 002560400038002c
Sep 24 09:39:31 dev4 kernel: R13: ffff88103f803400 R14:
00000000ffffffff R15: 0000000000010220
Sep 24 09:39:31 dev4 kernel: FS: 0000000000000000(0000)
GS:ffff88103fce0000(0000) knlGS:0000000000000000
Sep 24 09:39:31 dev4 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 24 09:39:31 dev4 kernel: CR2: 00007fcdf3508000 CR3:
0000001d41163000 CR4: 00000000001427e0
Sep 24 09:39:31 dev4 kernel: Stack:
Sep 24 09:39:31 dev4 kernel: ffffffff8151d1f9 ffff8804fa153600
ffff88102456f9e7 0000000000000020
Sep 24 09:39:31 dev4 kernel: 0000000000000800 00000000ffffffff
ffff88102456f9c8 ffffffff8151d11c
Sep 24 09:39:31 dev4 kernel: ffff88103fcf4790 ffff8804fa153600
0000000000000020 00000000ffffffff
Sep 24 09:39:31 dev4 kernel: Call Trace:
Sep 24 09:39:31 dev4 kernel: [<ffffffff8151d1f9>] ? __alloc_skb+0x79/0x1c0
Sep 24 09:39:31 dev4 kernel: [<ffffffff8151d11c>]
__kmalloc_reserve.isra.41+0x2c/0x90
Sep 24 09:39:31 dev4 kernel: [<ffffffff8151d1f9>] __alloc_skb+0x79/0x1c0
Sep 24 09:39:31 dev4 kernel: [<ffffffff81589fb4>] sk_stream_alloc_skb+0x34/0x100
Sep 24 09:39:31 dev4 kernel: [<ffffffff8158ad07>] tcp_sendmsg+0x667/0xd20
Sep 24 09:39:31 dev4 kernel: [<ffffffff815b11c2>] inet_sendmsg+0x42/0xb0
Sep 24 09:39:31 dev4 kernel: [<ffffffff81513380>] sock_sendmsg+0xa0/0xc0
Sep 24 09:39:31 dev4 kernel: [<ffffffff810a6f31>] ? load_balance+0x151/0x850
Sep 24 09:39:31 dev4 kernel: [<ffffffff815133d2>] kernel_sendmsg+0x32/0x40
Sep 24 09:39:31 dev4 kernel: [<ffffffff81622a09>] xs_send_kvec+0x89/0xa0
Sep 24 09:39:31 dev4 kernel: [<ffffffff81622e5e>] xs_sendpages+0x5e/0x1f0
Sep 24 09:39:31 dev4 kernel: [<ffffffff81623057>] xs_tcp_send_request+0x67/0x120
Sep 24 09:39:31 dev4 kernel: [<ffffffff81620cd0>] xprt_transmit+0x50/0x260
Sep 24 09:39:31 dev4 kernel: [<ffffffff8161dfb8>] call_transmit+0x1a8/0x250
Sep 24 09:39:31 dev4 kernel: [<ffffffff816255a6>] __rpc_execute+0x56/0x280
Sep 24 09:39:31 dev4 kernel: [<ffffffff8108a208>] ?
pwq_activate_delayed_work+0x28/0x40
Sep 24 09:39:31 dev4 kernel: [<ffffffff816257f1>] rpc_async_schedule+0x21/0x30
Sep 24 09:39:31 dev4 kernel: [<ffffffff8108c81b>] process_one_work+0x13b/0x390
Sep 24 09:39:31 dev4 kernel: [<ffffffff8108d17b>] worker_thread+0x11b/0x510
Sep 24 09:39:31 dev4 kernel: [<ffffffff8108d060>] ?
cancel_delayed_work_sync+0x10/0x10
Sep 24 09:39:31 dev4 kernel: [<ffffffff81092b44>] kthread+0xc4/0xe0
Sep 24 09:39:31 dev4 kernel: [<ffffffff81092a80>] ?
kthread_create_on_node+0x170/0x170

This machine is serving as NFS client and KVM hypervisor. I'm still
not sure how to trigger it reliably (right now, I just have to run
ubuntu instalation in VM guest multiple times and sometimes it
triggers).

Attaching .config, slub.s and disassembled __kmalloc_node_track_caller function.

--
mg
Dump of assembler code for function __kmalloc_node_track_caller:
0x0000000000005500 <+0>: push %rbp
0x0000000000005501 <+1>: mov %rsp,%rbp
0x0000000000005504 <+4>: push %r15
0x0000000000005506 <+6>: mov %esi,%r15d
0x0000000000005509 <+9>: push %r14
0x000000000000550b <+11>: mov %edx,%r14d
0x000000000000550e <+14>: push %r13
0x0000000000005510 <+16>: push %r12
0x0000000000005512 <+18>: push %rbx
0x0000000000005513 <+19>: sub $0x8,%rsp
0x0000000000005517 <+23>: cmp $0x2000,%rdi
0x000000000000551e <+30>: mov %rcx,-0x30(%rbp)
0x0000000000005522 <+34>: ja 0x55e4 <__kmalloc_node_track_caller+228>
0x0000000000005528 <+40>: callq 0x552d <__kmalloc_node_track_caller+45>
0x000000000000552d <+45>: cmp $0x10,%rax
0x0000000000005531 <+49>: mov %rax,%r13
0x0000000000005534 <+52>: jbe 0x558e <__kmalloc_node_track_caller+142>
0x0000000000005536 <+54>: mov 0x0(%r13),%r8
0x000000000000553a <+58>: add %gs:0x0,%r8
0x0000000000005543 <+67>: mov 0x8(%r8),%rdx
0x0000000000005547 <+71>: mov (%r8),%r12
0x000000000000554a <+74>: mov 0x10(%r8),%rax
0x000000000000554e <+78>: test %r12,%r12
0x0000000000005551 <+81>: je 0x556d <__kmalloc_node_track_caller+109>
0x0000000000005553 <+83>: test %rax,%rax
0x0000000000005556 <+86>: je 0x556d <__kmalloc_node_track_caller+109>
0x0000000000005558 <+88>: cmp $0xffffffff,%r14d
0x000000000000555c <+92>: je 0x55a0 <__kmalloc_node_track_caller+160>
0x000000000000555e <+94>: mov (%rax),%rax
0x0000000000005561 <+97>: shr $0x27,%rax
0x0000000000005565 <+101>: and $0x3f,%eax
0x0000000000005568 <+104>: cmp %eax,%r14d
0x000000000000556b <+107>: je 0x55a0 <__kmalloc_node_track_caller+160>
0x000000000000556d <+109>: mov -0x30(%rbp),%rcx
0x0000000000005571 <+113>: mov %r15d,%esi
0x0000000000005574 <+116>: mov %r14d,%edx
0x0000000000005577 <+119>: mov %r13,%rdi
0x000000000000557a <+122>: callq 0x557f <__kmalloc_node_track_caller+127>
0x000000000000557f <+127>: and $0x8000,%r15d
0x0000000000005586 <+134>: mov %rax,%r12
0x0000000000005589 <+137>: jne 0x55f2 <__kmalloc_node_track_caller+242>
0x000000000000558b <+139>: mov %r12,%rax
0x000000000000558e <+142>: add $0x8,%rsp
0x0000000000005592 <+146>: pop %rbx
0x0000000000005593 <+147>: pop %r12
0x0000000000005595 <+149>: pop %r13
0x0000000000005597 <+151>: pop %r14
0x0000000000005599 <+153>: pop %r15
0x000000000000559b <+155>: pop %rbp
0x000000000000559c <+156>: retq
0x000000000000559d <+157>: nopl (%rax)
0x00000000000055a0 <+160>: movslq 0x20(%r13),%rax
0x00000000000055a4 <+164>: lea 0x1(%rdx),%rcx
0x00000000000055a8 <+168>: mov 0x0(%r13),%rdi
0x00000000000055ac <+172>: mov (%r12,%rax,1),%rbx
0x00000000000055b0 <+176>: mov %r12,%rax
0x00000000000055b3 <+179>: lea (%rdi),%rsi
0x00000000000055b6 <+182>: callq 0x55bb <__kmalloc_node_track_caller+187>
0x00000000000055bb <+187>: test %al,%al
0x00000000000055bd <+189>: je 0x5536 <__kmalloc_node_track_caller+54>
0x00000000000055c3 <+195>: movslq 0x20(%r13),%rax
0x00000000000055c7 <+199>: and $0x8000,%r15d
0x00000000000055ce <+206>: prefetcht0 (%rbx,%rax,1)
0x00000000000055d2 <+210>: je 0x558b <__kmalloc_node_track_caller+139>
0x00000000000055d4 <+212>: movslq 0x1c(%r13),%rdx
0x00000000000055d8 <+216>: xor %esi,%esi
0x00000000000055da <+218>: mov %r12,%rdi
0x00000000000055dd <+221>: callq 0x55e2 <__kmalloc_node_track_caller+226>
0x00000000000055e2 <+226>: jmp 0x558b <__kmalloc_node_track_caller+139>
0x00000000000055e4 <+228>: callq 0x55e9 <__kmalloc_node_track_caller+233>
0x00000000000055e9 <+233>: nopl 0x0(%rax)
0x00000000000055f0 <+240>: jmp 0x558e <__kmalloc_node_track_caller+142>
0x00000000000055f2 <+242>: test %rax,%rax
0x00000000000055f5 <+245>: jne 0x55d4 <__kmalloc_node_track_caller+212>
0x00000000000055f7 <+247>: xor %eax,%eax
0x00000000000055f9 <+249>: jmp 0x558e <__kmalloc_node_track_caller+142>

Attachment: kernel.config
Description: Binary data

Attachment: slub.s
Description: Binary data