hotplug lockdep splat (tip-rt)
From: Mike Galbraith
Date: Sat Sep 02 2017 - 07:10:26 EST
This is the 4.11-rt patch set rolled forward onto tip, so sprinkle liberally with seasoning of choice.
[ 7514.772861] ======================================================
[ 7514.772862] WARNING: possible circular locking dependency detected
[ 7514.772863] 4.13.0.g06260ca-rt11-tip-lockdep #20 Tainted: G E
[ 7514.772863] ------------------------------------------------------
[ 7514.772867] stress-cpu-hotp/4102 is trying to acquire lock:
[ 7514.772867] ((complete)&st->done){+.+.}, at: [<ffffffff8107208a>] takedown_cpu+0x9a/0x120
[ 7514.772877]
[ 7514.772877] but task is already holding lock:
[ 7514.772877] (sparse_irq_lock){+.+.}, at: [<ffffffff8107203a>] takedown_cpu+0x4a/0x120
[ 7514.772879]
[ 7514.772879] which lock already depends on the new lock.
[ 7514.772879]
[ 7514.772879]
[ 7514.772879] the existing dependency chain (in reverse order) is:
[ 7514.772880]
[ 7514.772880] -> #2 (sparse_irq_lock){+.+.}:
[ 7514.772889] lock_acquire+0xbd/0x250
[ 7514.772908] _mutex_lock+0x31/0x50
[ 7514.772913] irq_affinity_online_cpu+0x13/0xc0
[ 7514.772914] cpuhp_invoke_callback+0x24c/0x9c0
[ 7514.772914] cpuhp_up_callbacks+0x30/0xb0
[ 7514.772915] cpuhp_thread_fun+0x159/0x170
[ 7514.772918] smpboot_thread_fn+0x268/0x310
[ 7514.772919] kthread+0x145/0x180
[ 7514.772921] ret_from_fork+0x2a/0x40
[ 7514.772922]
[ 7514.772922] -> #1 (cpuhp_state){+.+.}:
[ 7514.772926] smpboot_thread_fn+0x268/0x310
[ 7514.772927] kthread+0x145/0x180
[ 7514.772928] ret_from_fork+0x2a/0x40
[ 7514.772930] 0xffffffffffffffff
[ 7514.772930]
[ 7514.772930] -> #0 ((complete)&st->done){+.+.}:
[ 7514.772932] __lock_acquire+0x113b/0x1190
[ 7514.772933] lock_acquire+0xbd/0x250
[ 7514.772934] wait_for_completion+0x51/0x120
[ 7514.772935] takedown_cpu+0x9a/0x120
[ 7514.772936] cpuhp_invoke_callback+0x24c/0x9c0
[ 7514.772937] cpuhp_down_callbacks+0x3b/0x80
[ 7514.772939] _cpu_down+0xba/0xf0
[ 7514.772940] do_cpu_down+0x35/0x50
[ 7514.772949] device_offline+0x7d/0xa0
[ 7514.772950] online_store+0x3a/0x70
[ 7514.772959] kernfs_fop_write+0x10a/0x190
[ 7514.772962] __vfs_write+0x23/0x150
[ 7514.772963] vfs_write+0xc2/0x1c0
[ 7514.772964] SyS_write+0x45/0xa0
[ 7514.772965] entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 7514.772966]
[ 7514.772966] other info that might help us debug this:
[ 7514.772966]
[ 7514.772966] Chain exists of:
[ 7514.772966] (complete)&st->done --> cpuhp_state --> sparse_irq_lock
[ 7514.772966]
[ 7514.772968] Possible unsafe locking scenario:
[ 7514.772968]
[ 7514.772968]        CPU0                    CPU1
[ 7514.772968]        ----                    ----
[ 7514.772968]   lock(sparse_irq_lock);
[ 7514.772969]                                lock(cpuhp_state);
[ 7514.772970]                                lock(sparse_irq_lock);
[ 7514.772970]   lock((complete)&st->done);
[ 7514.772971]
[ 7514.772971] *** DEADLOCK ***
[ 7514.772971]
[ 7514.772972] 8 locks held by stress-cpu-hotp/4102:
[ 7514.772972] #0: (sb_writers#4){.+.+}, at: [<ffffffff8126c410>] vfs_write+0x190/0x1c0
[ 7514.772974] #1: (&of->mutex){+.+.}, at: [<ffffffff8130187a>] kernfs_fop_write+0xda/0x190
[ 7514.772976] #2: (s_active#140){.+.+}, at: [<ffffffff81301882>] kernfs_fop_write+0xe2/0x190
[ 7514.772979] #3: (device_hotplug_lock){+.+.}, at: [<ffffffff8153aa71>] lock_device_hotplug_sysfs+0x11/0x40
[ 7514.772981] #4: (&dev->mutex){....}, at: [<ffffffff8153c20f>] device_offline+0x3f/0xa0
[ 7514.772983] #5: (cpu_add_remove_lock){+.+.}, at: [<ffffffff8107352f>] do_cpu_down+0x1f/0x50
[ 7514.772985] #6: (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff810d2101>] percpu_down_write+0x21/0x110
[ 7514.772987] #7: (sparse_irq_lock){+.+.}, at: [<ffffffff8107203a>] takedown_cpu+0x4a/0x120
[ 7514.772989]
[ 7514.772989] stack backtrace:
[ 7514.772990] CPU: 5 PID: 4102 Comm: stress-cpu-hotp Tainted: G E 4.13.0.g06260ca-rt11-tip-lockdep #20
[ 7514.772991] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[ 7514.772992] Call Trace:
[ 7514.772995] dump_stack+0x7c/0xbf
[ 7514.772997] print_circular_bug+0x2d3/0x2e0
[ 7514.772999] ? copy_trace+0xb0/0xb0
[ 7514.773001] check_prev_add+0x666/0x700
[ 7514.773002] ? copy_trace+0xb0/0xb0
[ 7514.773008] ? __stop_cpus+0x51/0x70
[ 7514.773010] ? copy_trace+0xb0/0xb0
[ 7514.773011] __lock_acquire+0x113b/0x1190
[ 7514.773013] ? trace_hardirqs_on_caller+0xf2/0x1a0
[ 7514.773015] lock_acquire+0xbd/0x250
[ 7514.773018] ? takedown_cpu+0x9a/0x120
[ 7514.773020] wait_for_completion+0x51/0x120
[ 7514.773021] ? takedown_cpu+0x9a/0x120
[ 7514.773022] ? cpuhp_invoke_callback+0x9c0/0x9c0
[ 7514.773023] takedown_cpu+0x9a/0x120
[ 7514.773025] ? cpuhp_complete_idle_dead+0x10/0x10
[ 7514.773026] cpuhp_invoke_callback+0x24c/0x9c0
[ 7514.773028] cpuhp_down_callbacks+0x3b/0x80
[ 7514.773030] _cpu_down+0xba/0xf0
[ 7514.773031] do_cpu_down+0x35/0x50
[ 7514.773033] device_offline+0x7d/0xa0
[ 7514.773034] online_store+0x3a/0x70
[ 7514.773036] kernfs_fop_write+0x10a/0x190
[ 7514.773037] __vfs_write+0x23/0x150
[ 7514.773039] ? rcu_read_lock_sched_held+0x9b/0xb0
[ 7514.773043] ? rcu_sync_lockdep_assert+0x2d/0x60
[ 7514.773045] ? __sb_start_write+0x190/0x240
[ 7514.773046] ? vfs_write+0x190/0x1c0
[ 7514.773048] vfs_write+0xc2/0x1c0
[ 7514.773050] SyS_write+0x45/0xa0
[ 7514.773051] entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 7514.773053] RIP: 0033:0x7fc5e51fd2d0
[ 7514.773053] RSP: 002b:00007ffd73d17678 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 7514.773054] RAX: ffffffffffffffda RBX: 00007fc5e54bd678 RCX: 00007fc5e51fd2d0
[ 7514.773055] RDX: 0000000000000002 RSI: 00007fc5e5d64000 RDI: 0000000000000001
[ 7514.773056] RBP: 00007fc5e54bd620 R08: 000000000000000a R09: 00007fc5e5d16700
[ 7514.773056] R10: 000000000198bc50 R11: 0000000000000246 R12: 0000000000000110
[ 7514.773057] R13: 00000000000000e4 R14: 0000000000002710 R15: 00000000000000f1
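
For anyone reading the splat cold, here is a toy model of what the report above is saying. It is only a sketch in Python, not how lockdep is implemented: the `edges` dict and the `find_path` helper are made up for illustration, and the lock names are copied from the "Chain exists of:" line. An edge A -> B means "B was acquired while A was held". takedown_cpu() holds sparse_irq_lock and then waits on st->done, and a path already leads from st->done back to sparse_irq_lock, so the new edge would close a cycle.

```python
def find_path(edges, src, dst, seen=None):
    """Depth-first search: return the list of locks from src to dst, or None."""
    if seen is None:
        seen = set()
    if src == dst:
        return [src]
    seen.add(src)
    for nxt in edges.get(src, ()):
        if nxt not in seen:
            rest = find_path(edges, nxt, dst, seen)
            if rest is not None:
                return [src] + rest
    return None


# Dependencies already recorded, per "Chain exists of:" in the splat.
# (Hypothetical representation; A -> B means B was taken while holding A.)
edges = {
    "(complete)&st->done": ["cpuhp_state"],
    "cpuhp_state": ["sparse_irq_lock"],
}

# New acquisition from the report: holding sparse_irq_lock, waiting on st->done.
held, wanted = "sparse_irq_lock", "(complete)&st->done"

# If the wanted lock can already reach the held lock, adding held -> wanted
# would make the dependency graph circular.
cycle = find_path(edges, wanted, held)
if cycle is not None:
    print(" --> ".join(cycle + [wanted]))
```

Run as is, it prints the existing chain with the new edge appended, which ends where it started. Without the new acquisition there is no path from sparse_irq_lock to st->done, which is why the graph was clean until takedown_cpu() waited on the completion with sparse_irq_lock held.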