BUG: hard lockups on docker operations

From: Torsten Luettgert
Date: Fri Mar 11 2016 - 06:37:43 EST


Hello kernel hackers,

I'm getting hard lockups with kernel 4.4.5 (4.4.3 and 4.4.4 are
affected as well, at least). This is a docker hypervisor machine with
overlayfs and 3ware RAID.

The lockups usually happen when I do something with docker; in this
case, it was when I ran poweroff in a container (these are all VM-like
containers that run /sbin/init inside).

Oops follows:

NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
Kernel panic - not syncing: Hard LOCKUP
CPU: 4 PID: 26116 Comm: plymouthd Not tainted 4.4.5 #1
Hardware name: Supermicro X8DTT/X8DTT, BIOS 2.1c 04/22/2014
0000000000000000 ffff880c3fc05b70 ffffffff814328e7 ffffffff81cf26a2
0000000000000000 ffff880c3fc05be8 ffffffff811b2afc ffff880c00000008
ffff880c3fc05bf8 ffff880c3fc05b98 0000000000000000 0000000000000046
Call Trace:
<NMI> [<ffffffff814328e7>] dump_stack+0x63/0x8c
[<ffffffff811b2afc>] panic+0xc8/0x20f
[<ffffffff81173880>] watchdog_overflow_callback+0xe0/0xe0
[<ffffffff811af4d8>] __perf_event_overflow+0x88/0x1c0
[<ffffffff811aff74>] perf_event_overflow+0x14/0x20
[<ffffffff810773ec>] intel_pmu_handle_irq+0x1cc/0x430
[<ffffffff814344c9>] ? ioremap_page_range+0x299/0x410
[<ffffffff811f082c>] ? vunmap_page_range+0x1dc/0x310
[<ffffffff811f0971>] ? unmap_kernel_range_noflush+0x11/0x20
[<ffffffff814d8e96>] ? ghes_copy_tofrom_phys+0x116/0x1f0
[<ffffffff814d8fe6>] ? ghes_read_estatus+0x76/0x150
[<ffffffff8106dfc8>] perf_event_nmi_handler+0x28/0x50
[<ffffffff8105e0d1>] nmi_handle+0x61/0x120
[<ffffffff8105e3dd>] default_do_nmi+0xad/0xf0
[<ffffffff8105e501>] do_nmi+0xe1/0x150
[<ffffffff818cec11>] end_repeat_nmi+0x1a/0x1e
[<ffffffff815345cf>] ? qi_submit_sync+0x17f/0x3e0
[<ffffffff815345cf>] ? qi_submit_sync+0x17f/0x3e0
[<ffffffff815345cf>] ? qi_submit_sync+0x17f/0x3e0
<<EOE>> [<ffffffff8153b85f>] modify_irte+0xaf/0x140
[<ffffffff8153b936>] intel_irq_remapping_activate+0x16/0x20
[<ffffffff81131bc1>] irq_domain_activate_irq+0x41/0x50
[<ffffffff81131bab>] irq_domain_activate_irq+0x2b/0x50
[<ffffffff8112f355>] irq_startup+0x35/0x80
[<ffffffff8112dec8>] __setup_irq+0x528/0x5c0
[<ffffffff815100c0>] ? serial8250_backup_timeout+0x120/0x120
[<ffffffff8112e0e4>] request_threaded_irq+0xf4/0x1b0
[<ffffffff81511041>] univ8250_setup_irq+0x231/0x270
[<ffffffff8151404f>] serial8250_do_startup+0x12f/0x650
[<ffffffff81514595>] serial8250_startup+0x25/0x30
[<ffffffff8150e005>] uart_startup.part.16+0x85/0x1c0
[<ffffffff8150e25b>] uart_open+0x11b/0x170
[<ffffffff814f2ac1>] tty_open+0x101/0x620
[<ffffffff81547a5d>] ? kobj_lookup+0x10d/0x160
[<ffffffff81225eba>] chrdev_open+0xaa/0x170
[<ffffffff8121f897>] do_dentry_open+0x227/0x320
[<ffffffff81225e10>] ? cdev_put+0x30/0x30
[<ffffffff81220907>] vfs_open+0x57/0x60
[<ffffffff8122dff1>] path_openat+0x181/0x1140
[<ffffffff8123073e>] do_filp_open+0x7e/0xd0
[<ffffffff8123d01f>] ? __alloc_fd+0x3f/0x170
[<ffffffff81220c68>] do_sys_open+0x128/0x210
[<ffffffff81220d6e>] SyS_open+0x1e/0x20
[<ffffffff818cca2e>] entry_SYSCALL_64_fastpath+0x12/0x71
Shutting down cpus with NMI
Kernel Offset: disabled

If more info is needed, I'll happily provide it.

Regards,
Torsten