hotplug lockdep splat (tip)

From: Mike Galbraith
Date: Sun Sep 03 2017 - 03:10:36 EST



[ 126.626908] Unregister pv shared memory for cpu 1
[ 126.631041]
[ 126.631269] ======================================================
[ 126.632442] WARNING: possible circular locking dependency detected
[ 126.633599] 4.13.0.g06260ca-tip-lockdep #2 Tainted: G E
[ 126.634380] ------------------------------------------------------
[ 126.635124] stress-cpu-hotp/3156 is trying to acquire lock:
[ 126.635804] ((complete)&st->done){+.+.}, at: [<ffffffff8108d19a>] takedown_cpu+0x8a/0xf0
[ 126.636809]
[ 126.636809] but task is already holding lock:
[ 126.637567] (sparse_irq_lock){+.+.}, at: [<ffffffff81107ac7>] irq_lock_sparse+0x17/0x20
[ 126.638665]
[ 126.638665] which lock already depends on the new lock.
[ 126.638665]
[ 126.639679]
[ 126.639679] the existing dependency chain (in reverse order) is:
[ 126.641086]
[ 126.641086] -> #2 (sparse_irq_lock){+.+.}:
[ 126.641862] __lock_acquire+0x116a/0x11c0
[ 126.642456] lock_acquire+0xbd/0x200
[ 126.642975] __mutex_lock+0x70/0x8e0
[ 126.643499] mutex_lock_nested+0x1b/0x20
[ 126.644118] irq_lock_sparse+0x17/0x20
[ 126.644870] irq_affinity_online_cpu+0x18/0xd0
[ 126.645941] cpuhp_invoke_callback+0x1f6/0x830
[ 126.647000] cpuhp_up_callbacks+0x37/0xb0
[ 126.647961] cpuhp_thread_fun+0x14f/0x160
[ 126.648672] smpboot_thread_fn+0x11a/0x1e0
[ 126.649511] kthread+0x152/0x190
[ 126.650347] ret_from_fork+0x2a/0x40
[ 126.651239]
[ 126.651239] -> #1 (cpuhp_state){+.+.}:
[ 126.651956] cpuhp_thread_fun+0x54/0x160
[ 126.652523] smpboot_thread_fn+0x11a/0x1e0
[ 126.653233] kthread+0x152/0x190
[ 126.654090] ret_from_fork+0x2a/0x40
[ 126.656166] 0xffffffffffffffff
[ 126.657242]
[ 126.657242] -> #0 ((complete)&st->done){+.+.}:
[ 126.658407] check_prev_add+0x666/0x700
[ 126.659570] __lock_acquire+0x116a/0x11c0
[ 126.660576] lock_acquire+0xbd/0x200
[ 126.663278] wait_for_completion+0x53/0x1a0
[ 126.666632] takedown_cpu+0x8a/0xf0
[ 126.669845] cpuhp_invoke_callback+0x1f6/0x830
[ 126.670623] cpuhp_down_callbacks+0x42/0x80
[ 126.673508] _cpu_down+0xc6/0x100
[ 126.674323] do_cpu_down+0x3e/0x60
[ 126.674825] cpu_down+0x10/0x20
[ 126.675306] cpu_subsys_offline+0x14/0x20
[ 126.675895] device_offline+0x8c/0xb0
[ 126.676561] online_store+0x40/0x80
[ 126.677081] dev_attr_store+0x18/0x30
[ 126.677674] sysfs_kf_write+0x44/0x60
[ 126.678338] kernfs_fop_write+0x113/0x1a0
[ 126.678922] __vfs_write+0x28/0x150
[ 126.679553] vfs_write+0xc7/0x1c0
[ 126.680034] SyS_write+0x49/0xa0
[ 126.680563] entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 126.681459]
[ 126.681459] other info that might help us debug this:
[ 126.681459]
[ 126.683271] Chain exists of:
[ 126.683271] (complete)&st->done --> cpuhp_state --> sparse_irq_lock
[ 126.683271]
[ 126.687307] Possible unsafe locking scenario:
[ 126.687307]
[ 126.688244]        CPU0                    CPU1
[ 126.689027]        ----                    ----
[ 126.689591]   lock(sparse_irq_lock);
[ 126.690036]                                lock(cpuhp_state);
[ 126.690970]                                lock(sparse_irq_lock);
[ 126.692057]   lock((complete)&st->done);
[ 126.693199]
[ 126.693199] *** DEADLOCK ***
[ 126.693199]
[ 126.694283] 8 locks held by stress-cpu-hotp/3156:
[ 126.694772] #0: (sb_writers#4){.+.+}, at: [<ffffffff8129b226>] vfs_write+0x196/0x1c0
[ 126.696285] #1: (&of->mutex){+.+.}, at: [<ffffffff81334fc3>] kernfs_fop_write+0xe3/0x1a0
[ 126.697178] #2: (s_active#158){.+.+}, at: [<ffffffff81334fcb>] kernfs_fop_write+0xeb/0x1a0
[ 126.698367] #3: (device_hotplug_lock){+.+.}, at: [<ffffffff8158f255>] lock_device_hotplug_sysfs+0x15/0x40
[ 126.699632] #4: (&dev->mutex){....}, at: [<ffffffff81590aaa>] device_offline+0x4a/0xb0
[ 126.701229] #5: (cpu_add_remove_lock){+.+.}, at: [<ffffffff8108e587>] do_cpu_down+0x27/0x60
[ 126.703277] #6: (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff810ed7c6>] percpu_down_write+0x26/0x110
[ 126.706302] #7: (sparse_irq_lock){+.+.}, at: [<ffffffff81107ac7>] irq_lock_sparse+0x17/0x20
[ 126.708966]
[ 126.708966] stack backtrace:
[ 126.709558] CPU: 0 PID: 3156 Comm: stress-cpu-hotp Tainted: G E 4.13.0.g06260ca-tip-lockdep #2
[ 126.710790] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[ 126.712075] Call Trace:
[ 126.712523] dump_stack+0x85/0xc5
[ 126.712904] print_circular_bug+0x2e9/0x2f7
[ 126.713308] ? copy_trace+0xc0/0xc0
[ 126.713636] check_prev_add+0x666/0x700
[ 126.713978] ? copy_trace+0xc0/0xc0
[ 126.714426] ? copy_trace+0xc0/0xc0
[ 126.714847] __lock_acquire+0x116a/0x11c0
[ 126.715452] lock_acquire+0xbd/0x200
[ 126.716073] ? takedown_cpu+0x8a/0xf0
[ 126.716930] wait_for_completion+0x53/0x1a0
[ 126.717805] ? takedown_cpu+0x8a/0xf0
[ 126.718352] ? stop_machine_cpuslocked+0xee/0x100
[ 126.718867] ? cpuhp_invoke_callback+0x830/0x830
[ 126.719489] ? cpuhp_complete_idle_dead+0x20/0x20
[ 126.720109] takedown_cpu+0x8a/0xf0
[ 126.720597] cpuhp_invoke_callback+0x1f6/0x830
[ 126.721208] cpuhp_down_callbacks+0x42/0x80
[ 126.721805] _cpu_down+0xc6/0x100
[ 126.722292] do_cpu_down+0x3e/0x60
[ 126.722773] cpu_down+0x10/0x20
[ 126.723207] cpu_subsys_offline+0x14/0x20
[ 126.723762] device_offline+0x8c/0xb0
[ 126.724233] online_store+0x40/0x80
[ 126.724831] dev_attr_store+0x18/0x30
[ 126.725520] sysfs_kf_write+0x44/0x60
[ 126.726255] kernfs_fop_write+0x113/0x1a0
[ 126.727038] __vfs_write+0x28/0x150
[ 126.727753] ? rcu_read_lock_sched_held+0x98/0xa0
[ 126.728503] ? rcu_sync_lockdep_assert+0x2f/0x60
[ 126.729106] ? __sb_start_write+0x14e/0x1e0
[ 126.729509] ? vfs_write+0x196/0x1c0
[ 126.729974] ? security_file_permission+0x3b/0xc0
[ 126.730472] vfs_write+0xc7/0x1c0
[ 126.731070] ? trace_hardirqs_on_caller+0xf9/0x190
[ 126.731655] SyS_write+0x49/0xa0
[ 126.732054] entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 126.732666] RIP: 0033:0x7f1e159502d0
[ 126.733056] RSP: 002b:00007ffc1190efa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 126.734263] RAX: ffffffffffffffda RBX: 00007f1e15c10678 RCX: 00007f1e159502d0
[ 126.734973] RDX: 0000000000000002 RSI: 00007f1e164b7000 RDI: 0000000000000001
[ 126.735855] RBP: 00007f1e15c10620 R08: 000000000000000a R09: 00007f1e16469700
[ 126.736842] R10: 0000000001720c40 R11: 0000000000000246 R12: 0000000000000110
[ 126.738137] R13: 00000000000000e4 R14: 0000000000002710 R15: 00000000000000f1
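
For anyone reading the splat cold: the report amounts to a cycle in a lock-order graph. Below is a minimal sketch in Python, not lockdep's actual algorithm. The edges are transcribed by hand from the "Chain exists of" line, following the arrows as printed; the new edge is the acquisition the task attempts (it holds sparse_irq_lock and waits on the completion). The helper name find_cycle is hypothetical.

```python
# Edges follow the arrows in the "Chain exists of" line of the splat.
# Transcribed by hand; an illustration only, not how lockdep is implemented.
existing_chain = [
    ("(complete)&st->done", "cpuhp_state"),
    ("cpuhp_state", "sparse_irq_lock"),
]

# New dependency from takedown_cpu(): the task already holds
# sparse_irq_lock and then waits on the st->done completion.
new_edge = ("sparse_irq_lock", "(complete)&st->done")


def find_cycle(edges, new):
    """Return the path that closes a cycle if `new` is added, else None."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    held, wanted = new
    # Depth-first search from the wanted lock; reaching the held lock
    # means the held lock already depends on the wanted one.
    stack = [(wanted, [wanted])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == held:
            return [held] + path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


cycle = find_cycle(existing_chain, new_edge)
print(" --> ".join(cycle) if cycle else "no cycle")
```

With the edges above the search walks from the completion through cpuhp_state back to sparse_irq_lock, which is the circular dependency the warning describes.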