Re: lockdep warning: console vs. mem hotplug

From: Sebastian Ott
Date: Fri Mar 24 2017 - 12:58:04 EST


On Fri, 24 Mar 2017, Steven Rostedt wrote:
> Is there a stack trace of where the lockdep dump happened? That is
> useful too. Otherwise we don't see where the inverse happened.

Sure. Here you go.

[ 132.379220] ======================================================
[ 132.379220] [ INFO: possible circular locking dependency detected ]
[ 132.379221] 4.11.0-rc3 #330 Not tainted
[ 132.379221] -------------------------------------------------------
[ 132.379222] sh/708 is trying to acquire lock:
[ 132.379222] (&(&sclp_con_lock)->rlock){-.-...}, at: [<000000000074524a>] sclp_console_write+0x4a/0x2b8
[ 132.379224]
[ 132.379224] but task is already holding lock:
[ 132.379224] (&(&zone->lock)->rlock){..-...}, at: [<00000000002a65a8>] __offline_isolated_pages+0xe0/0x340
[ 132.379226]
[ 132.379227] which lock already depends on the new lock.
[ 132.379227]
[ 132.379227]
[ 132.379228] the existing dependency chain (in reverse order) is:
[ 132.379228]
[ 132.379229] -> #2 (&(&zone->lock)->rlock){..-...}:
[ 132.379230] validate_chain.isra.22+0xb36/0xd18
[ 132.379231] __lock_acquire+0x650/0x880
[ 132.379231] lock_acquire+0x24e/0x2b0
[ 132.379232] _raw_spin_lock_irqsave+0x70/0xb8
[ 132.379232] get_page_from_freelist+0x548/0xf30
[ 132.379233] __alloc_pages_nodemask+0x1c8/0x14c0
[ 132.379233] allocate_slab+0xde/0x668
[ 132.379233] new_slab+0x94/0xa8
[ 132.379234] ___slab_alloc.constprop.66+0x556/0x578
[ 132.379234] __slab_alloc.isra.62.constprop.65+0x74/0xa8
[ 132.379235] kmem_cache_alloc+0x136/0x4b8
[ 132.379235] __debug_object_init+0xaa/0x448
[ 132.379236] debug_object_activate+0x182/0x238
[ 132.379236] mod_timer+0xe6/0x450
[ 132.379236] prandom_reseed+0x44/0x50
[ 132.379237] do_one_initcall+0xa2/0x150
[ 132.379237] kernel_init_freeable+0x228/0x2d8
[ 132.379238] kernel_init+0x2a/0x138
[ 132.379238] kernel_thread_starter+0x6/0xc
[ 132.379239] kernel_thread_starter+0x0/0xc
[ 132.379239]
[ 132.379239] -> #1 (&(&base->lock)->rlock){-.-.-.}:
[ 132.379241] validate_chain.isra.22+0xb36/0xd18
[ 132.379242] __lock_acquire+0x650/0x880
[ 132.379242] lock_acquire+0x24e/0x2b0
[ 132.379242] _raw_spin_lock_irqsave+0x70/0xb8
[ 132.379243] lock_timer_base+0x78/0xa8
[ 132.379243] mod_timer+0xba/0x450
[ 132.379244] sclp_console_write+0x262/0x2b8
[ 132.379244] console_unlock+0x4a6/0x658
[ 132.379245] register_console+0x31c/0x430
[ 132.379245] sclp_console_init+0x152/0x170
[ 132.379245] console_init+0x3a/0x58
[ 132.379246] start_kernel+0x350/0x480
[ 132.379246] _stext+0x20/0x80
[ 132.379246]
[ 132.379247] -> #0 (&(&sclp_con_lock)->rlock){-.-...}:
[ 132.379249] check_prev_add+0x150/0x648
[ 132.379249] validate_chain.isra.22+0xb36/0xd18
[ 132.379249] __lock_acquire+0x650/0x880
[ 132.379250] lock_acquire+0x24e/0x2b0
[ 132.379250] _raw_spin_lock_irqsave+0x70/0xb8
[ 132.379251] sclp_console_write+0x4a/0x2b8
[ 132.379251] console_unlock+0x4a6/0x658
[ 132.379252] vprintk_emit+0x33e/0x350
[ 132.379252] vprintk_default+0x44/0x58
[ 132.379252] printk+0x4e/0x58
[ 132.379253] __offline_isolated_pages+0x1a0/0x340
[ 132.379253] offline_isolated_pages_cb+0x2e/0x40
[ 132.379254] walk_system_ram_range+0x96/0xe8
[ 132.379254] __offline_pages.constprop.43+0x6da/0x918
[ 132.379255] memory_subsys_offline+0x6a/0xa0
[ 132.379255] device_offline+0x84/0xd8
[ 132.379255] store_mem_state+0xfe/0x120
[ 132.379256] kernfs_fop_write+0x126/0x1d0
[ 132.379256] __vfs_write+0x46/0x140
[ 132.379257] vfs_write+0xb8/0x1a8
[ 132.379257] SyS_write+0x66/0xc0
[ 132.379257] system_call+0xc4/0x240
[ 132.379258]
[ 132.379258] other info that might help us debug this:
[ 132.379258]
[ 132.379259] Chain exists of:
[ 132.379259] &(&sclp_con_lock)->rlock --> &(&base->lock)->rlock --> &(&zone->lock)->rlock
[ 132.379262]
[ 132.379262] Possible unsafe locking scenario:
[ 132.379262]
[ 132.379263] CPU0 CPU1
[ 132.379263] ---- ----
[ 132.379264] lock(&(&zone->lock)->rlock);
[ 132.379265] lock(&(&base->lock)->rlock);
[ 132.379266] lock(&(&zone->lock)->rlock);
[ 132.379267] lock(&(&sclp_con_lock)->rlock);
[ 132.379268]
[ 132.379268] *** DEADLOCK ***
[ 132.379269]
[ 132.379269] 10 locks held by sh/708:
[ 132.379269] #0: (sb_writers#4){.+.+.+}, at: [<0000000000345d4a>] vfs_write+0xa2/0x1a8
[ 132.379272] #1: (&of->mutex){+.+.+.}, at: [<00000000003e5e60>] kernfs_fop_write+0xd8/0x1d0
[ 132.379273] #2: (s_active#180){.+.+.+}, at: [<00000000003e5e6c>] kernfs_fop_write+0xe4/0x1d0
[ 132.379275] #3: (device_hotplug_lock){+.+...}, at: [<000000000069bb58>] lock_device_hotplug_sysfs+0x30/0x68
[ 132.379277] #4: (memory_add_remove_lock){+.+...}, at: [<000000000031cfe6>] mem_hotplug_begin+0x3e/0xd0
[ 132.379279] #5: (mem_hotplug.lock){++++.+}, at: [<000000000031cfa8>] mem_hotplug_begin+0x0/0xd0
[ 132.379281] #6: (mem_hotplug.lock#2){+.+...}, at: [<000000000031d02c>] mem_hotplug_begin+0x84/0xd0
[ 132.379283] #7: (&dev->mutex){......}, at: [<000000000069d55c>] device_offline+0x5c/0xd8
[ 132.379285] #8: (&(&zone->lock)->rlock){..-...}, at: [<00000000002a65a8>] __offline_isolated_pages+0xe0/0x340
[ 132.379287] #9: (console_lock){+.+...}, at: [<00000000001bdc02>] vprintk_emit+0x332/0x350
[ 132.379289]
[ 132.379289] stack backtrace:
[ 132.379290] CPU: 1 PID: 708 Comm: sh Not tainted 4.11.0-rc3 #330
[ 132.379290] Hardware name: IBM 2827 H66 703 (LPAR)
[ 132.379290] Stack:
[ 132.379291] 000000006f5ab458 000000006f5ab4e8 0000000000000003 0000000000000000
[ 132.379293] 000000006f5ab588 000000006f5ab500 000000006f5ab500 0000000000000020
[ 132.379294] 0000000000929180 000000000000000a 0000000000000020 000000000000000a
[ 132.379296] 000000000000000c 000000006f5ab550 0000000000000000 0000000000000000
[ 132.379298] 04000000016e0b18 0000000000111da2 000000006f5ab4e8 000000006f5ab540
[ 132.379300] Call Trace:
[ 132.379300] ([<0000000000111c86>] show_trace+0x8e/0xe0)
[ 132.379301] [<0000000000111d58>] show_stack+0x80/0xd8
[ 132.379301] [<00000000005ff656>] dump_stack+0x96/0xd8
[ 132.379302] [<0000000000291572>] print_circular_bug+0x2f2/0x308
[ 132.379302] [<00000000001a6b98>] check_prev_add+0x150/0x648
[ 132.379303] [<00000000001a7bc6>] validate_chain.isra.22+0xb36/0xd18
[ 132.379303] [<00000000001a9240>] __lock_acquire+0x650/0x880
[ 132.379304] [<00000000001a9bee>] lock_acquire+0x24e/0x2b0
[ 132.379304] [<00000000008dcab0>] _raw_spin_lock_irqsave+0x70/0xb8
[ 132.379304] [<000000000074524a>] sclp_console_write+0x4a/0x2b8
[ 132.379305] [<00000000001bd71e>] console_unlock+0x4a6/0x658
[ 132.379305] [<00000000001bdc0e>] vprintk_emit+0x33e/0x350
[ 132.379306] [<00000000001bddec>] vprintk_default+0x44/0x58
[ 132.379306] [<0000000000291c8e>] printk+0x4e/0x58
[ 132.379307] [<00000000002a6668>] __offline_isolated_pages+0x1a0/0x340
[ 132.379307] [<000000000031cbbe>] offline_isolated_pages_cb+0x2e/0x40
[ 132.379308] [<0000000000147466>] walk_system_ram_range+0x96/0xe8
[ 132.379308] [<00000000008d2bba>] __offline_pages.constprop.43+0x6da/0x918
[ 132.379309] [<00000000006b9b2a>] memory_subsys_offline+0x6a/0xa0
[ 132.379309] [<000000000069d584>] device_offline+0x84/0xd8
[ 132.379310] [<00000000006ba146>] store_mem_state+0xfe/0x120
[ 132.379310] [<00000000003e5eae>] kernfs_fop_write+0x126/0x1d0
[ 132.379311] [<0000000000344b0e>] __vfs_write+0x46/0x140
[ 132.379311] [<0000000000345d60>] vfs_write+0xb8/0x1a8
[ 132.379312] [<000000000034742e>] SyS_write+0x66/0xc0
[ 132.379312] [<00000000008dda2c>] system_call+0xc4/0x240
[ 132.379313] INFO: lockdep is turned off.




[ 377.096545] ======================================================
[ 377.096545] [ INFO: possible circular locking dependency detected ]
[ 377.096546] 4.11.0-rc2-00238-ga722844 #120 Not tainted
[ 377.096546] -------------------------------------------------------
[ 377.096547] sh/848 is trying to acquire lock:
[ 377.096547] ((console_sem).lock){-.....}, at: [<00000000001a594e>] down_trylock+0x2e/0x68
[ 377.096549] but task is already holding lock:
[ 377.096550] (&(&zone->lock)->rlock){..-...}, at: [<00000000002ae3da>] __offline_isolated_pages+0x302/0x370
[ 377.096552] which lock already depends on the new lock.
[ 377.096553] the existing dependency chain (in reverse order) is:

[ 377.096554] -> #3 (&(&zone->lock)->rlock){..-...}:
[ 377.096556] validate_chain.isra.10+0xb7a/0xda8
[ 377.096557] __lock_acquire+0x62c/0x850
[ 377.096557] lock_acquire+0x254/0x2b8
[ 377.096558] _raw_spin_lock_irqsave+0x70/0xb8
[ 377.096558] get_page_from_freelist+0x4e2/0xf28
[ 377.096558] __alloc_pages_nodemask+0x1fa/0x1528
[ 377.096559] allocate_slab+0xf2/0x6c0
[ 377.096559] new_slab+0x94/0xa8
[ 377.096560] ___slab_alloc.constprop.22+0x570/0x590
[ 377.096560] __slab_alloc.isra.16.constprop.21+0x74/0xa8
[ 377.096561] kmem_cache_alloc+0x13c/0x4b0
[ 377.096561] __debug_object_init+0x5c/0x468
[ 377.096561] hrtimer_init+0x42/0x1d8
[ 377.096562] init_dl_task_timer+0x3a/0x58
[ 377.096562] __sched_fork.isra.5+0x7e/0xe8
[ 377.096563] init_idle+0x7a/0x278
[ 377.096563] fork_idle+0xa4/0xb8
[ 377.096564] idle_threads_init+0x64/0xc8
[ 377.096564] smp_init+0x2e/0x110
[ 377.096565] kernel_init_freeable+0x16c/0x2e0
[ 377.096565] kernel_init+0x2a/0x148
[ 377.096565] kernel_thread_starter+0x6/0xc
[ 377.096566] kernel_thread_starter+0x0/0xc

[ 377.096567] -> #2 (&rq->lock){-.-.-.}:
[ 377.096568] validate_chain.isra.10+0xb7a/0xda8
[ 377.096569] __lock_acquire+0x62c/0x850
[ 377.096569] lock_acquire+0x254/0x2b8
[ 377.096569] _raw_spin_lock+0x60/0xa0
[ 377.096570] task_fork_fair+0x54/0x118
[ 377.096570] sched_fork+0x13e/0x2a0
[ 377.096571] copy_process+0x642/0x1e80
[ 377.096571] _do_fork+0xc2/0x6d0
[ 377.096571] kernel_thread+0x4e/0x60
[ 377.096572] rest_init+0x42/0x170
[ 377.096572] start_kernel+0x464/0x478
[ 377.096572] _stext+0x20/0x80

[ 377.096573] -> #1 (&p->pi_lock){-.-.-.}:
[ 377.096574] validate_chain.isra.10+0xb7a/0xda8
[ 377.096575] __lock_acquire+0x62c/0x850
[ 377.096575] lock_acquire+0x254/0x2b8
[ 377.096576] _raw_spin_lock_irqsave+0x70/0xb8
[ 377.096576] try_to_wake_up+0x4a/0x5f8
[ 377.096576] up+0x58/0x70
[ 377.096577] __up_console_sem+0x5a/0x90
[ 377.096577] console_unlock+0x610/0x670
[ 377.096578] vprintk_emit+0x308/0x378
[ 377.096578] dev_vprintk_emit+0x1e6/0x200
[ 377.096578] dev_printk_emit+0x4e/0x60
[ 377.096579] __dev_printk+0x8e/0xb0
[ 377.096579] dev_warn+0x68/0x78
[ 377.096579] qeth_dev_portname_store+0x58/0x70
[ 377.096580] kernfs_fop_write+0x124/0x1c8
[ 377.096580] __vfs_write+0x36/0x140
[ 377.096580] vfs_write+0xb8/0x1a0
[ 377.096581] SyS_write+0x66/0xc0
[ 377.096581] system_call+0xc4/0x240

[ 377.096582] -> #0 ((console_sem).lock){-.....}:
[ 377.096583] check_prev_add+0x160/0x6e8
[ 377.096584] validate_chain.isra.10+0xb7a/0xda8
[ 377.096584] __lock_acquire+0x62c/0x850
[ 377.096584] lock_acquire+0x254/0x2b8
[ 377.096585] _raw_spin_lock_irqsave+0x70/0xb8
[ 377.096585] down_trylock+0x2e/0x68
[ 377.096586] __down_trylock_console_sem+0x4a/0xc0
[ 377.096586] console_trylock+0x32/0xc8
[ 377.096586] vprintk_emit+0x2fc/0x378
[ 377.096587] vprintk_default+0x44/0x58
[ 377.096587] printk+0x4e/0x60
[ 377.096587] __offline_isolated_pages+0x168/0x370
[ 377.096588] offline_isolated_pages_cb+0x2e/0x40
[ 377.096588] walk_system_ram_range+0x92/0xf0
[ 377.096589] __offline_pages.constprop.9+0x74a/0x988
[ 377.096589] memory_subsys_offline+0x6c/0xa0
[ 377.096589] device_offline+0x84/0xe0
[ 377.096590] store_mem_state+0xfe/0x120
[ 377.096590] kernfs_fop_write+0x124/0x1c8
[ 377.096590] __vfs_write+0x36/0x140
[ 377.096591] vfs_write+0xb8/0x1a0
[ 377.096591] SyS_write+0x66/0xc0
[ 377.096592] system_call+0xc4/0x240

[ 377.096592] other info that might help us debug this:

[ 377.096593] Chain exists of:
[ 377.096593] (console_sem).lock --> &rq->lock --> &(&zone->lock)->rlock

[ 377.096595] Possible unsafe locking scenario:

[ 377.096596] CPU0 CPU1
[ 377.096596] ---- ----
[ 377.096597] lock(&(&zone->lock)->rlock);
[ 377.096598] lock(&rq->lock);
[ 377.096599] lock(&(&zone->lock)->rlock);
[ 377.096600] lock((console_sem).lock);

[ 377.096601] *** DEADLOCK ***

[ 377.096601] 9 locks held by sh/848:
[ 377.096602] #0: (sb_writers#5){.+.+.+}, at: [<000000000034f9c2>] vfs_write+0xa2/0x1a0
[ 377.096604] #1: (&of->mutex){+.+.+.}, at: [<00000000003f1afe>] kernfs_fop_write+0xd6/0x1c8
[ 377.096605] #2: (s_active#51){.+.+.+}, at: [<00000000003f1b0a>] kernfs_fop_write+0xe2/0x1c8
[ 377.096607] #3: (device_hotplug_lock){+.+.+.}, at: [<00000000006adb70>] lock_device_hotplug_sysfs+0x30/0x70
[ 377.096609] #4: (memory_add_remove_lock){+.+.+.}, at: [<0000000000328e16>] mem_hotplug_begin+0x3e/0xd0
[ 377.096610] #5: (mem_hotplug.lock){++++++}, at: [<0000000000328dd8>] mem_hotplug_begin+0x0/0xd0
[ 377.096612] #6: (mem_hotplug.lock#2){+.+.+.}, at: [<0000000000328e5c>] mem_hotplug_begin+0x84/0xd0
[ 377.096614] #7: (&dev->mutex){......}, at: [<00000000006af41c>] device_offline+0x5c/0xe0
[ 377.096615] #8: (&(&zone->lock)->rlock){..-...}, at: [<00000000002ae3da>] __offline_isolated_pages+0x302/0x370

[ 377.096617] stack backtrace:
[ 377.096618] CPU: 0 PID: 848 Comm: sh Not tainted 4.11.0-rc2-00238-ga722844 #120
[ 377.096618] Hardware name: IBM 2827 H66 703 (z/VM)
[ 377.096619] Stack:
[ 377.096619] 000000003ed0f450 000000003ed0f4e0 0000000000000003 0000000000000000
[ 377.096620] 000000003ed0f580 000000003ed0f4f8 000000003ed0f4f8 0000000000000020
[ 377.096629] 00000000009441c0 0000000000000020 000000000000000a 000000000000000a
[ 377.096631] 000000000000000c 000000003ed0f548 0000000000000000 0000000000000000
[ 377.096633] 04000000016fa078 0000000000112338 000000003ed0f4e0 000000003ed0f538
[ 377.096635] Call Trace:
[ 377.096635] ([<0000000000112212>] show_trace+0x8a/0xe0)
[ 377.096636] [<00000000001122e8>] show_stack+0x80/0xd8
[ 377.096636] [<000000000060f5a6>] dump_stack+0x96/0xd8
[ 377.096637] [<00000000001a9d3c>] print_circular_bug+0x314/0x340
[ 377.096637] [<00000000001aacc0>] check_prev_add+0x160/0x6e8
[ 377.096638] [<00000000001abdc2>] validate_chain.isra.10+0xb7a/0xda8
[ 377.096638] [<00000000001ad7dc>] __lock_acquire+0x62c/0x850
[ 377.096639] [<00000000001ae20c>] lock_acquire+0x254/0x2b8
[ 377.096639] [<00000000008f63b8>] _raw_spin_lock_irqsave+0x70/0xb8
[ 377.096639] [<00000000001a594e>] down_trylock+0x2e/0x68
[ 377.096640] [<00000000001c1262>] __down_trylock_console_sem+0x4a/0xc0
[ 377.096641] [<00000000001c130a>] console_trylock+0x32/0xc8
[ 377.096641] [<00000000001c2c84>] vprintk_emit+0x2fc/0x378
[ 377.096641] [<00000000001c2edc>] vprintk_default+0x44/0x58
[ 377.096642] [<000000000029a24e>] printk+0x4e/0x60
[ 377.096642] [<00000000002ae240>] __offline_isolated_pages+0x168/0x370
[ 377.096643] [<00000000003289d6>] offline_isolated_pages_cb+0x2e/0x40
[ 377.096643] [<00000000001492fa>] walk_system_ram_range+0x92/0xf0
[ 377.096644] [<00000000008ec1fa>] __offline_pages.constprop.9+0x74a/0x988
[ 377.096644] [<00000000006cc9bc>] memory_subsys_offline+0x6c/0xa0
[ 377.096645] [<00000000006af444>] device_offline+0x84/0xe0
[ 377.096645] [<00000000006cd0ae>] store_mem_state+0xfe/0x120
[ 377.096645] [<00000000003f1b4c>] kernfs_fop_write+0x124/0x1c8
[ 377.096646] [<000000000034e736>] __vfs_write+0x36/0x140
[ 377.096646] [<000000000034f9d8>] vfs_write+0xb8/0x1a0
[ 377.096647] [<000000000035110e>] SyS_write+0x66/0xc0
[ 377.096647] [<00000000008f74ac>] system_call+0xc4/0x240
[ 377.096648] INFO: lockdep is turned off.