Re: kernel OOPS in MM(?)

From: Evgenii Lepikhin
Date: Wed Mar 16 2016 - 00:42:41 EST


Hello,

On 2016-03-10 12:31, Evgenii Lepikhin wrote:

> We need help understanding the source of the problem, and maybe filing a bug report. Here is the crash report:
>
> Mar 10 04:03:51 l28 kernel: [2075560.434445] BUG: unable to handle kernel paging request at 0000000040008021
> Mar 10 04:03:51 l28 kernel: [2075560.434669] IP: [<ffffffff810ee519>] __kmalloc+0x69/0x100
> Mar 10 04:03:51 l28 kernel: [2075560.434800] PGD b7e462067 PUD 0
> Mar 10 04:03:51 l28 kernel: [2075560.434913] Oops: 0000 [#1] SMP
> Mar 10 04:03:51 l28 kernel: [2075560.435044] Modules linked in: tcm_loop iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse [last unloaded: ipfw_mod]
> Mar 10 04:03:51 l28 kernel: [2075560.435539] CPU: 4 PID: 27141 Comm: rm Tainted: G O 3.12.51-jl-2015-12-25 #1
> Mar 10 04:03:51 l28 kernel: [2075560.435734] Hardware name: Intel Corporation S2600IP ........../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
> Mar 10 04:03:51 l28 kernel: [2075560.435939] task: ffff880e622ccba0 ti: ffff880eeb008000 task.ti: ffff880eeb008000
> Mar 10 04:03:51 l28 kernel: [2075560.436131] RIP: 0010:[<ffffffff810ee519>] [<ffffffff810ee519>] __kmalloc+0x69/0x100
> Mar 10 04:03:51 l28 kernel: [2075560.436333] RSP: 0018:ffff880eeb009b38 EFLAGS: 00010282
> Mar 10 04:03:51 l28 kernel: [2075560.436439] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000a8a73dc2
> Mar 10 04:03:51 l28 kernel: [2075560.436632] RDX: 00000000a8a73dc1 RSI: 0000000000000000 RDI: 0000000000013500
> Mar 10 04:03:51 l28 kernel: [2075560.438248] RBP: ffff880eeb009b58 R08: ffff88103fc13500 R09: ffffffff811a0267
> Mar 10 04:03:51 l28 kernel: [2075560.438446] R10: ffff880eeb009d84 R11: 0000000000000000 R12: ffff88081f803a00
> Mar 10 04:03:51 l28 kernel: [2075560.438656] R13: 0000000040008021 R14: 0000000000000250 R15: ffff880250e833b0
> Mar 10 04:03:51 l28 kernel: [2075560.438851] FS: 00007fe2316dd700(0000) GS:ffff88103fc00000(0000) knlGS:0000000000000000
> Mar 10 04:03:51 l28 kernel: [2075560.439045] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Mar 10 04:03:51 l28 kernel: [2075560.439152] CR2: 0000000040008021 CR3: 0000000a20736000 CR4: 00000000000407e0
> Mar 10 04:03:51 l28 kernel: [2075560.439343] Stack:
> Mar 10 04:03:51 l28 kernel: [2075560.439439] 0000000000000000 0000000000000250 0000000000000060 0000000000000000
> Mar 10 04:03:51 l28 kernel: [2075560.439663] ffff880eeb009b88 ffffffff811a0267 ffff881015fb7fe0 0000000000000060
> Mar 10 04:03:51 l28 kernel: [2075560.439898] ffff880250e83490 0000000000000000 ffff880eeb009ba8 ffffffff811a02f8
> Mar 10 04:03:51 l28 kernel: [2075560.440153] Call Trace:
> Mar 10 04:03:51 l28 kernel: [2075560.440257] [<ffffffff811a0267>] kmem_alloc+0x67/0xe0
> Mar 10 04:03:51 l28 kernel: [2075560.440365] [<ffffffff811a02f8>] kmem_zalloc+0x18/0x40
> Mar 10 04:03:51 l28 kernel: [2075560.440473] [<ffffffff811e0523>] xfs_log_commit_cil+0x373/0x4c0
> Mar 10 04:03:51 l28 kernel: [2075560.440585] [<ffffffff811aab00>] ? xfs_bmap_search_multi_extents+0xe0/0x110
> Mar 10 04:03:51 l28 kernel: [2075560.440783] [<ffffffff8119fb6c>] xfs_trans_commit+0x6c/0x250
> Mar 10 04:03:51 l28 kernel: [2075560.440899] [<ffffffff811875d7>] xfs_bmap_finish+0xb7/0x1a0

Another oops on the same server, with the same instruction pointer and the same faulting address:

Mar 16 04:53:54 l28 kernel: [521052.387878] BUG: unable to handle kernel paging request at 0000000040008021
Mar 16 04:53:54 l28 kernel: [521052.388022] IP: [<ffffffff810ee519>] __kmalloc+0x69/0x100
Mar 16 04:53:54 l28 kernel: [521052.388171] PGD 0
Mar 16 04:53:54 l28 kernel: [521052.388289] Oops: 0000 [#1] SMP
Mar 16 04:53:54 l28 kernel: [521052.388410] Modules linked in: tcm_loop iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ipt_NETFLOW(O) configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse
Mar 16 04:53:54 l28 kernel: [521052.388913] CPU: 6 PID: 5947 Comm: iscsi_trx Tainted: G O 3.12.51-jl-2015-12-25 #1
Mar 16 04:53:54 l28 kernel: [521052.389125] Hardware name: Intel Corporation S2600IP ........../S2600IP, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
Mar 16 04:53:54 l28 kernel: [521052.389351] task: ffff88081a3a6720 ti: ffff8808162de000 task.ti: ffff8808162de000
Mar 16 04:53:54 l28 kernel: [521052.389566] RIP: 0010:[<ffffffff810ee519>] [<ffffffff810ee519>] __kmalloc+0x69/0x100
Mar 16 04:53:54 l28 kernel: [521052.389782] RSP: 0018:ffff8808162dfd18 EFLAGS: 00010286
Mar 16 04:53:54 l28 kernel: [521052.389899] RAX: 0000000000000000 RBX: ffff880819a51800 RCX: 0000000003b305d3
Mar 16 04:53:54 l28 kernel: [521052.390112] RDX: 0000000003b305d2 RSI: 0000000000000000 RDI: 0000000000013500
Mar 16 04:53:54 l28 kernel: [521052.390309] RBP: ffff8808162dfd38 R08: ffff88103fd13500 R09: ffffffffa00e7072
Mar 16 04:53:54 l28 kernel: [521052.390503] R10: 0000000000100000 R11: 0000000000000030 R12: ffff88081f803a00
Mar 16 04:53:54 l28 kernel: [521052.390694] R13: 0000000040008021 R14: 00000000000080d0 R15: ffff8808162dfdd0
Mar 16 04:53:54 l28 kernel: [521052.390888] FS: 0000000000000000(0000) GS:ffff88103fd00000(0000) knlGS:0000000000000000
Mar 16 04:53:54 l28 kernel: [521052.391082] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 16 04:53:54 l28 kernel: [521052.391190] CR2: 0000000040008021 CR3: 000000000180b000 CR4: 00000000000407e0
Mar 16 04:53:54 l28 kernel: [521052.391382] Stack:
Mar 16 04:53:54 l28 kernel: [521052.391477] ffff880819a51800 ffff8808162dfe50 ffff880816319cc0 0000000000000006
Mar 16 04:53:54 l28 kernel: [521052.393132] ffff8808162dfeb8 ffffffffa00e7072 ffff8808162dfd78 ffff88103fd10880
Mar 16 04:53:54 l28 kernel: [521052.393345] 0000000000000000 ffff880800000000 0000000000001000 ffff880816318124
Mar 16 04:53:54 l28 kernel: [521052.393585] Call Trace:
Mar 16 04:53:54 l28 kernel: [521052.393703] [<ffffffffa00e7072>] iscsi_target_rx_thread+0x592/0xd80 [iscsi_target_mod]
Mar 16 04:53:54 l28 kernel: [521052.393909] [<ffffffff8148b200>] ? __schedule+0x2e0/0x8e0
Mar 16 04:53:54 l28 kernel: [521052.394052] [<ffffffffa00e6ae0>] ? iscsi_target_tx_thread+0x280/0x280 [iscsi_target_mod]
Mar 16 04:53:54 l28 kernel: [521052.394274] [<ffffffff8105c5cb>] kthread+0xbb/0xc0
Mar 16 04:53:54 l28 kernel: [521052.394392] [<ffffffff8105c510>] ? __kthread_parkme+0x80/0x80
Mar 16 04:53:54 l28 kernel: [521052.394508] [<ffffffff8148d818>] ret_from_fork+0x58/0x90
Mar 16 04:53:54 l28 kernel: [521052.394619] [<ffffffff8105c510>] ? __kthread_parkme+0x80/0x80
Mar 16 04:53:54 l28 kernel: [521052.394740] Code: 65 4c 03 04 25 48 bc 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 74 6b 48 85 c0 74 66 49 63 44 24 20 48 8d 4a 01 49 8b 3c 24 <49> 8b 5c 05 00 4c 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 74 bd 49
Mar 16 04:53:54 l28 kernel: [521052.395596] RIP [<ffffffff810ee519>] __kmalloc+0x69/0x100
Mar 16 04:53:54 l28 kernel: [521052.395721] RSP <ffff8808162dfd18>
Mar 16 04:53:54 l28 kernel: [521052.395841] CR2: 0000000040008021
Mar 16 04:53:54 l28 kernel: [521052.396883] ---[ end trace 20dde4e477f78d24 ]---

>
> Kernel 3.12.51. Gdb listing:
>
> (gdb) list *(__kmalloc+0x69)
> 0xffffffff810ee519 is in __kmalloc (mm/slub.c:260).
> [...]
> 258 static inline void *get_freepointer(struct kmem_cache *s, void *object)
> 259 {
> 260 return *(void **)(object + s->offset);
> 261 }
>
> What would be the next step? Thank you.
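
One possible way to read the register dump against the gdb listing, as a rough sketch rather than a diagnosis: the bytes marked in the Code: line (<49> 8b 5c 05 00) appear to decode to mov 0x0(%r13,%rax,1),%rbx, so R13 would hold `object` and RAX would hold `s->offset` in get_freepointer(). That mapping is an assumption drawn from the disassembly, not something the logs state directly. The helper names below (parse_registers, looks_like_kernel_pointer) are made up for this illustration, and the address check assumes x86_64 with 48-bit addressing.

```python
import re

# Register lines copied from the Mar 16 oops (syslog prefix stripped).
OOPS = """\
RAX: 0000000000000000 RBX: ffff880819a51800 RCX: 0000000003b305d3
R13: 0000000040008021 R14: 00000000000080d0 R15: ffff8808162dfdd0
CR2: 0000000040008021 CR3: 000000000180b000 CR4: 00000000000407e0
"""


def parse_registers(text):
    """Return {name: int} for every 'NAME: <16 hex digits>' pair found."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def looks_like_kernel_pointer(addr):
    # Assumption: x86_64 with 48-bit addressing, where kernel addresses
    # sit in the upper canonical half (0xffff800000000000 and above).
    return addr >= 0xFFFF800000000000


regs = parse_registers(OOPS)

# get_freepointer() reads *(object + s->offset). Under the assumed
# register mapping, that address is R13 + RAX.
fault_addr = regs["R13"] + regs["RAX"]

print("computed address:", hex(fault_addr))
print("matches CR2:", fault_addr == regs["CR2"])
print("R13 looks like a kernel pointer:", looks_like_kernel_pointer(regs["R13"]))
print("R13 modulo 8:", regs["R13"] % 8)
```

If this reading holds, the value used as the next free object (0000000040008021) is neither in the kernel address range nor 8-byte aligned, which would point at a corrupted slab freelist rather than a fault inside __kmalloc itself. R12 and R13 are identical in both oopses, so the same cache may be involved each time.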

--
UNIX/Ocaml engineer at 1Gb.ru. Telegram: johnlepikhin