Re: deadlock in debugfs synchronize_srcu() when unplugging USB

From: Paul E. McKenney
Date: Thu Oct 12 2017 - 17:04:58 EST


On Thu, Oct 12, 2017 at 04:01:48PM -0400, Tyler Hall wrote:
> Hi,
>
> I have a reproducible scenario wherein removing a USB device while
> reading /sys/kernel/debug/usb/devices causes a deadlock. This should
> not be specific to any USB device. Any USB device removal that causes
> a call to debugfs_remove() has inverted lock ordering with respect to
> the read() of debug/usb/devices.
>
> e.g.
> read thread: srcu_read_lock(&debugfs_srcu);
> -- usb unplug --
> remove thread: mutex_lock(&usb_bus_idr_lock);
> remove thread: synchronize_srcu(&debugfs_srcu); <- blocked
> read thread: mutex_lock(&usb_bus_idr_lock); <- blocked
> read thread: srcu_read_unlock(&debugfs_srcu, ...);

The reader cannot exit its SRCU read-side critical section until it
acquires usb_bus_idr_lock. The updater's synchronize_srcu() is not
permitted to return until all pre-existing readers complete, and it won't
release usb_bus_idr_lock until that happens. So you have a deadlock,
pure and simple.

Use of RCU and SRCU can greatly reduce the possibility of deadlock,
but as you can see, sufficiently clever code can still manage to get
into a deadlock state.

The rule is "Within a read-side critical section, never wait on anything
that directly or indirectly waits on a grace period." The above code
violates that rule, and so the above code needs to be fixed.

> This seems to be another flavor of what Johannes Berg reported:
> deadlock in synchronize_srcu() in debugfs?
> https://lkml.org/lkml/2017/3/23/415

It does look quite similar.

> I applied this patch set from Nicolai Stange and can no longer
> reproduce the hang.
> [RFC PATCH v2 0/9] debugfs: per-file removal protection
> https://lkml.org/lkml/2017/5/3/292
>
> As patch 2/9 in the series indicates, commit 49d200deaa68 ("debugfs:
> prevent access to removed files' private data") is where this was
> first introduced, and it is reproducible on v4.14-rc4.
>
> How should we move forward with the resolution of this debugfs change?
> It seems to me that the USB locking is reasonable but the debugfs
> global srcu is overly restrictive. This could lead to unexpected lock
> inversion any time a driver shares a mutex between its debugfs read
> and removal paths.

It looks like no one took Nicolai's series and that Nicolai never
reposted it. Would you like to forward-port and repost?

Thanx, Paul

> Backtrace below. Thanks!
>
> -Tyler Hall
>
> This is easier to reproduce by adding a sleep before the
> usb_bus_idr_lock, but I've seen it on an unmodified kernel.
>
> diff --git a/drivers/usb/core/devices.c b/drivers/usb/core/devices.c
> index 55dea2e7828f..534650cd0950 100644
> --- a/drivers/usb/core/devices.c
> +++ b/drivers/usb/core/devices.c
> @@ -614,6 +614,7 @@ static ssize_t usb_device_read(struct file *file,
> char __user *buf,
> if (!access_ok(VERIFY_WRITE, buf, nbytes))
> return -EFAULT;
>
> + msleep(1000);
> mutex_lock(&usb_bus_idr_lock);
> /* print devices for all busses */
> idr_for_each_entry(&usb_bus_idr, bus, id) {
>
>
> [ 24.240542] sysrq: SysRq : Show Blocked State
> [ 24.240765] task PC stack pid father
> [ 24.240975] kworker/0:2 D13840 881 2 0x80000000
> [ 24.241525] Workqueue: usb_hub_wq hub_event
> [ 24.241682] Call Trace:
> [ 24.242273] __schedule+0x317/0x6d0
> [ 24.242442] schedule+0x31/0x80
> [ 24.242514] schedule_timeout+0x1d0/0x320
> [ 24.242603] ? __queue_work+0x135/0x400
> [ 24.242689] wait_for_completion+0x92/0xf0
> [ 24.242765] ? wait_for_completion+0x92/0xf0
> [ 24.242841] ? wake_up_q+0x70/0x70
> [ 24.242907] __synchronize_srcu.part.14+0x71/0x90
> [ 24.242985] ? trace_event_raw_event_rcu_torture_read+0xe0/0xe0
> [ 24.243169] synchronize_srcu_expedited+0x22/0x30
> [ 24.243265] ? synchronize_srcu_expedited+0x22/0x30
> [ 24.243347] synchronize_srcu+0x9a/0xc0
> [ 24.243418] debugfs_remove+0x6d/0xa0
> [ 24.243490] bdi_unregister+0x8b/0x170
> [ 24.243558] del_gendisk+0x139/0x220
> [ 24.243624] sd_remove+0x5c/0xc0
> [ 24.243685] device_release_driver_internal+0x150/0x210
> [ 24.243769] device_release_driver+0xd/0x10
> [ 24.243841] bus_remove_device+0xdb/0x120
> [ 24.243915] device_del+0x1c3/0x2e0
> [ 24.243977] __scsi_remove_device+0xff/0x130
> [ 24.244122] scsi_forget_host+0x5b/0x60
> [ 24.244203] scsi_remove_host+0x74/0x140
> [ 24.244281] usb_stor_disconnect+0x54/0xc0
> [ 24.244357] usb_unbind_interface+0x6d/0x260
> [ 24.244437] device_release_driver_internal+0x150/0x210
> [ 24.244520] device_release_driver+0xd/0x10
> [ 24.244591] bus_remove_device+0xdb/0x120
> [ 24.244659] device_del+0x1c3/0x2e0
> [ 24.244722] usb_disable_device+0x97/0x1f0
> [ 24.244792] usb_disconnect+0x88/0x230
> [ 24.244853] hub_event+0x5b9/0x11e0
> [ 24.244915] ? add_timer+0x10e/0x230
> [ 24.244984] process_one_work+0x146/0x3e0
> [ 24.245124] worker_thread+0x43/0x3e0
> [ 24.245204] kthread+0x104/0x140
> [ 24.245266] ? create_worker+0x190/0x190
> [ 24.245333] ? kthread_create_on_node+0x40/0x40
> [ 24.245406] ret_from_fork+0x22/0x30
>
> [ 24.245542] cat D13712 1029 1018 0x00000000
> [ 24.245652] Call Trace:
> [ 24.245705] __schedule+0x317/0x6d0
> [ 24.245770] schedule+0x31/0x80
> [ 24.245830] schedule_preempt_disabled+0x9/0x10
> [ 24.245903] __mutex_lock.isra.2+0x225/0x470
> [ 24.245975] __mutex_lock_slowpath+0xe/0x10
> [ 24.246110] ? __mutex_lock_slowpath+0xe/0x10
> [ 24.246199] mutex_lock+0x2a/0x30
> [ 24.246261] usb_device_read+0xb6/0x140
> [ 24.246325] full_proxy_read+0x4f/0x90
> [ 24.246394] __vfs_read+0x23/0x120
> [ 24.246456] ? security_file_permission+0x96/0xb0
> [ 24.246533] ? rw_verify_area+0x49/0xb0
> [ 24.246593] vfs_read+0x8e/0x130
> [ 24.246646] SyS_read+0x41/0xa0
> [ 24.246698] entry_SYSCALL_64_fastpath+0x13/0x94
>