Re: WARNING: possible circular locking dependency detected

From: Sebastian Andrzej Siewior
Date: Fri Aug 25 2017 - 12:42:45 EST


On 2017-08-25 12:03:04 [+0200], Borislav Petkov wrote:
> Hey,
Hi Borislav,

> tglx says I have something for ya :-)
:)
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.13.0-rc6+ #1 Not tainted
> ------------------------------------------------------
> watchdog/3/27 is trying to acquire lock:
> (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8100c489>] release_ds_buffers+0x29/0xd0
>
> but now in release context of a crosslock acquired at the following:
> ((complete)&self->parked){+.+.}, at: [<ffffffff810895f6>] kthread_park+0x46/0x60
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 ((complete)&self->parked){+.+.}:
> __lock_acquire+0x10af/0x1110
> lock_acquire+0xea/0x1f0
> wait_for_completion+0x3b/0x130
> kthread_park+0x46/0x60
> __smpboot_create_thread.part.5+0x7d/0xf0
> smpboot_register_percpu_thread_cpumask+0xa2/0x100
> spawn_ksoftirqd+0x3b/0x45
> do_one_initcall+0x52/0x198
> kernel_init_freeable+0x6f/0x1a1
> kernel_init+0xe/0x100
> ret_from_fork+0x2a/0x40
>
> -> #1 (smpboot_threads_lock){+.+.}:
> __lock_acquire+0x10af/0x1110
> lock_acquire+0xea/0x1f0
> __mutex_lock+0x6c/0x940
> mutex_lock_nested+0x1b/0x20
> smpboot_register_percpu_thread_cpumask+0x42/0x100
> spawn_ksoftirqd+0x3b/0x45
> do_one_initcall+0x52/0x198
> kernel_init_freeable+0x6f/0x1a1
> kernel_init+0xe/0x100
> ret_from_fork+0x2a/0x40
>
> -> #0 (cpu_hotplug_lock.rw_sem){++++}:
> cpus_read_lock+0x2a/0x90
> release_ds_buffers+0x29/0xd0
> x86_release_hardware+0x8f/0xa0
> hw_perf_event_destroy+0xe/0x20
> _free_event+0xa7/0x250
>
> other info that might help us debug this:
>
> Chain exists of:
> cpu_hotplug_lock.rw_sem --> smpboot_threads_lock --> (complete)&self->parked
>
> Possible unsafe locking scenario by crosslock:
>
> CPU0 CPU1
> ---- ----
> lock(smpboot_threads_lock);
> lock((complete)&self->parked);
> lock(cpu_hotplug_lock.rw_sem);
> unlock((complete)&self->parked);
>
> *** DEADLOCK ***
>
> 1 lock held by watchdog/3/27:
> #0: (&x->wait){....}, at: [<ffffffff810b115d>] complete+0x1d/0x60
>
> stack backtrace:
> CPU: 3 PID: 27 Comm: watchdog/3 Not tainted 4.13.0-rc6+ #1
> Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
> Call Trace:
> dump_stack+0x86/0xcf
> print_circular_bug+0x1fa/0x2f0
> check_prev_add+0x3be/0x700
> ? __lock_acquire+0x4c6/0x1110
> ? trace_event_raw_event_lock+0xf0/0xf0
> lock_commit_crosslock+0x40d/0x590
> ? lock_commit_crosslock+0x40d/0x590
> complete+0x29/0x60
> __kthread_parkme+0x54/0x80
> kthread_parkme+0x24/0x40
> smpboot_thread_fn+0x95/0x230
> kthread+0x147/0x180
> ? sort_range+0x30/0x30
> ? kthread_create_on_node+0x40/0x40
> ret_from_fork+0x2a/0x40

I don't really get that one. I *think* it complains about the
cpu_hotplug_lock.rw_sem which was held while waiting for the completion
->parked (this is smpboot_register_percpu_thread() -> kthread_park() ->
wait_for_completion()) *and* then other case where kthread_park() waits
for the thread (while holding cpu_hotplug_lock.rw_sem) and out of
nowhere release_ds_buffers() comes on the CPU and asks for the
cpu_hotplug_lock.rw_sem. So something like this happens:

<...>-1 [002] .... 1.050786: kthread_park: watchdog/3
<...>-1 [002] .... 1.050794: kthread_park: waitâ <- wait_for_completion()
<...>-27 [003] .... 1.050844: release_ds_buffers: before
<...>-27 [003] .... 1.050857: release_ds_buffers: after
<...>-27 [003] .... 1.050905: release_ds_buffers: released
<...>-27 [003] .... 1.050908: __kthread_parkme: do complete <- complete()
<...>-27 [003] d..1 1.050916: print_circular_bug:
<...>-1 [002] .... 1.459364: kthread_park: waitâ done

We can change the timing by adding a printk to kthread_parkme() and then
lockdep won't see release_ds_buffers() within the wait_for_completion() section:

<...>-1 [003] .... 1.065596: kthread_park: watchdog/3
<...>-27 [003] .... 1.065657: release_ds_buffers: before
<...>-27 [003] .... 1.065669: release_ds_buffers: after
<...>-27 [003] .... 1.065718: release_ds_buffers: released
<...>-27 [003] .... 1.072390: __kthread_parkme: do complete <- complete
<...>-1 [003] .... 1.072402: kthread_park: waitâ <- wait_for_completion()
<...>-1 [003] .... 1.072404: kthread_park: waitâ done

different timing, no complains.
I don't think that this is a bug because we can have multiple readers of
the rwsem and *if* a someone is holding a write lock then
release_ds_buffers() would be simply deferred until smbboot_.* and the
write-lock holder are done.
If my interpretation of the situation is correct, how do we teach this
lockdep.

Sebastian