4.11-rc7 crash when booting with 1 numa node?
From: Darrick J. Wong
Date: Sun Apr 16 2017 - 20:26:58 EST
Hi,
When booting 4.11-rc7 on a QEMU guest with a single NUMA node, I hit the
following[1] crash on boot. If I configure more than one node, the
problem goes away. I tracked the faulting address down to this line:
(gdb) l *(irq_create_affinity_masks+0x237)
0xffffffff81103a87 is in irq_create_affinity_masks (/raid/home/djwong/cdev/work/linux-xfs/kernel/irq/affinity.c:111).
106			/* Calculate the number of cpus per vector */
107			ncpus = cpumask_weight(nmsk);
108			vecs_to_assign = min(vecs_per_node, ncpus);
109
110			/* Account for rounding errors */
111			extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
112
113			for (v = 0; curvec < last_affv && v < vecs_to_assign;
114			     curvec++, v++) {
115				cpus_per_vec = ncpus / vecs_to_assign;
Not sure exactly what's going on here; I can look into it more tomorrow
at work but maybe this rings a bell already? The line in question was
last changed by commit 3412386b53 ("irq/affinity: Fix extra vecs
calculation"). :)
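
For what it's worth, the oops looks consistent with a zero divisor on
line 111: the faulting opcode in the Code: line, <f7> fb, decodes to
"idiv %ebx", and the register dump shows RBX = 0 with RAX = RCX = 4.
That would mean ncpus is 4 and vecs_to_assign is 0, although I have not
confirmed why vecs_per_node would be 0 here. Below is a minimal
userspace sketch of the same arithmetic with a guard added. The helper
name extra_vecs_checked() is hypothetical and is not a kernel function,
and the guard is only an illustration of the failure mode, not a
proposed fix.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not in the kernel): mirrors the arithmetic at
 * kernel/irq/affinity.c:108-111, but refuses to divide when there are
 * no vectors to assign.
 *
 * Returns 0 and stores the remainder in *extra_vecs on success,
 * or -1 if the original expression would have divided by zero.
 */
static int extra_vecs_checked(int ncpus, int vecs_per_node, int *extra_vecs)
{
	/* Same as min(vecs_per_node, ncpus) in the kernel source */
	int vecs_to_assign = vecs_per_node < ncpus ? vecs_per_node : ncpus;

	if (extra_vecs == NULL)
		return -1;

	/* Line 111 computes ncpus / vecs_to_assign; 0 here traps on x86 */
	if (vecs_to_assign <= 0)
		return -1;

	/* Account for rounding errors, as in the original line 111 */
	*extra_vecs = ncpus - vecs_to_assign * (ncpus / vecs_to_assign);
	return 0;
}
```

Calling extra_vecs_checked(4, 0, &extra) returns -1, which is the case
the register dump seems to point at. Calling extra_vecs_checked(4, 3,
&extra) returns 0 and sets extra to 1.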
Thanks,
--Darrick
[1] relevant dmesg:
[ 0.823932] divide error: 0000 [#1] PREEMPT SMP
[ 0.824818] Modules linked in:
[ 0.825469] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc7-xfsx #1
[ 0.826673] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 0.828137] task: ffff880078ff4f80 task.stack: ffffc9000031c000
[ 0.829096] RIP: 0010:irq_create_affinity_masks+0x237/0x360
[ 0.830004] RSP: 0000:ffffc9000031faa0 EFLAGS: 00010297
[ 0.830849] RAX: 0000000000000004 RBX: 0000000000000000 RCX: 0000000000000004
[ 0.831988] RDX: 0000000000000000 RSI: 0000000000000040 RDI: 000000000000000f
[ 0.833137] RBP: ffffc9000031fb10 R08: 0000000000000000 R09: 0000000000000001
[ 0.834296] R10: 0000000000000000 R11: 0000000000000001 R12: ffffc9000031fcb8
[ 0.835453] R13: 0000000000000004 R14: 000000000000a018 R15: 0000000000000002
[ 0.836589] FS: 0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000
[ 0.837948] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.838881] CR2: 00000000ffffffff CR3: 0000000001c11000 CR4: 00000000000006e0
[ 0.840029] Call Trace:
[ 0.840454] __pci_enable_msix+0x316/0x4c0
[ 0.841133] pci_alloc_irq_vectors_affinity+0xd6/0x170
[ 0.842020] vp_find_vqs_msix+0xee/0x470
[ 0.842739] vp_find_vqs+0x36/0x180
[ 0.843490] ? __kmalloc+0x27c/0x2e0
[ 0.844095] virtscsi_init+0xfd/0x280
[ 0.844718] ? vp_get+0x59/0x80
[ 0.845266] virtscsi_probe+0xeb/0x2f0
[ 0.845886] virtio_dev_probe+0x19d/0x200
[ 0.846547] driver_probe_device+0x204/0x2e0
[ 0.847272] __driver_attach+0x9f/0xb0
[ 0.847878] ? driver_probe_device+0x2e0/0x2e0
[ 0.848750] bus_for_each_dev+0x66/0xa0
[ 0.849419] driver_attach+0x1e/0x20
[ 0.850078] bus_add_driver+0x1b4/0x230
[ 0.850709] ? scsi_init_procfs+0x5b/0x5b
[ 0.851409] driver_register+0x60/0xe0
[ 0.852036] ? scsi_init_procfs+0x5b/0x5b
[ 0.852707] register_virtio_driver+0x20/0x30
[ 0.853456] init+0x85/0xcc
[ 0.853930] do_one_initcall+0x53/0x1b0
[ 0.854585] ? parse_args+0x26a/0x3f0
[ 0.855238] kernel_init_freeable+0x1d9/0x25c
[ 0.855947] ? rest_init+0x140/0x140
[ 0.856553] kernel_init+0xe/0x100
[ 0.857125] ret_from_fork+0x31/0x40
[ 0.857710] Code: c5 80 2a 09 82 48 89 cf 48 89 4d d0 f3 48 0f b8 c7 48 89 c1 8b 45 a0 44 29 f8 99 f7 7d a8 39 c8 0f 4f c1 89 c3 89 45 bc 89 c8 99 <f7> fb 44 39 7d c4 89 55 b8 89 45 ac 0f 8e b4 00 00 00 85 db 0f
[ 0.861065] RIP: irq_create_affinity_masks+0x237/0x360 RSP: ffffc9000031faa0
[ 0.862531] ---[ end trace e472b3c89b58d381 ]---
[ 0.863311] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 0.863311]
[ 0.864840] Kernel Offset: disabled
[ 0.865433] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 0.865433]