RE: System reboot hangs due to race against devices_kset->list triggered by SCSI FC workqueue

From: Hugh Daschbach
Date: Thu Mar 04 2010 - 17:33:12 EST


Alan Stern [mailto:stern@xxxxxxxxxxxxxxxxxxx] writes:

> On Thu, 4 Mar 2010, Hugh Daschbach wrote:
>
>> Is there some way I can detect that devn no longer points to a valid
>> device upon return from dev->*->shutdown(dev)? Or, where else can I
>> look to better understand your suggestion?
>
> Did you read the patch in my previous message? You didn't quote it.
> It removes the devn variable, so the problem you're worried about
> cannot occur.
>
> Alan Stern

Mea culpa. I looked but did not see. I have tested your patch. It
does solve my problem. I've attached my version, with comments as you
suggested, below.

I have added get_device()/put_device() to ensure the device hasn't
fully disappeared before list_del_init() is called. Is this needed?
If so, there's a commented-out might_sleep() in put_device(). Do I
need to release the lock before calling put_device()?
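
To make the locking question concrete, here is a rough userspace
sketch of the loop shape, not kernel code. Everything in it is a
hypothetical stand-in: struct node for struct device, a pthread mutex
for the kset spinlock, and a plain counter for the kobject refcount.
It assumes, conservatively, that dropping the reference might sleep,
so it releases the lock before doing so. Whether that is really
required for put_device() is exactly what I am asking.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; not a kernel type. */
struct node {
    struct node *prev, *next; /* circular list links, like kobj.entry */
    int refs;                 /* stand-in for the kobject refcount */
    int shut;                 /* set once the shutdown hook has run */
};

/* Stand-ins for devices_kset->list and devices_kset->list_lock. */
static struct node head = { &head, &head, 0, 0 };
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

void node_add_tail(struct node *n)
{
    pthread_mutex_lock(&list_lock);
    n->prev = head.prev;
    n->next = &head;
    head.prev->next = n;
    head.prev = n;
    pthread_mutex_unlock(&list_lock);
}

/* Like list_del_init(): harmless on a node that is already unlinked. */
void node_del_init(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Hypothetical shutdown hook; a real one may or may not unlink n. */
void node_shutdown(struct node *n)
{
    n->shut = 1;
}

/* Shut down every node, newest first. Returns the number handled. */
int shutdown_all(void)
{
    int count = 0;

    pthread_mutex_lock(&list_lock);
    while (head.prev != &head) {
        struct node *n = head.prev; /* re-read the tail every pass */

        n->refs++;                  /* pin, as get_device() would */
        pthread_mutex_unlock(&list_lock);

        node_shutdown(n);           /* runs without the lock held */

        pthread_mutex_lock(&list_lock);
        node_del_init(n);           /* in case the hook left it listed */
        pthread_mutex_unlock(&list_lock);

        n->refs--;                  /* unpin with the lock dropped; a
                                       real refcount would be atomic */
        count++;
        pthread_mutex_lock(&list_lock);
    }
    pthread_mutex_unlock(&list_lock);
    return count;
}
```

Because the tail is re-read from the list head on every pass, no
saved "next" pointer can go stale while the lock is dropped. The cost
of unpinning outside the lock is one extra unlock/lock pair per
device.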

Thanks for your patience.

Hugh

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 2820257..1133d7a 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1731,10 +1731,20 @@ EXPORT_SYMBOL_GPL(device_move);
*/
void device_shutdown(void)
{
- struct device *dev, *devn;
+ struct device *dev;
+
+ spin_lock(&devices_kset->list_lock);
+ /*
+ * Walk the devices list backward, shutting down each in turn.
+ * Beware that device unplug events may also start pulling
+ * devices offline, even as the system is shutting down.
+ */
+ while (!list_empty(&devices_kset->list)) {
+ dev = list_entry(devices_kset->list.prev, struct device,
+ kobj.entry);
+ get_device(dev);
+ spin_unlock(&devices_kset->list_lock);

- list_for_each_entry_safe_reverse(dev, devn, &devices_kset->list,
- kobj.entry) {
if (dev->bus && dev->bus->shutdown) {
dev_dbg(dev, "shutdown\n");
dev->bus->shutdown(dev);
@@ -1742,6 +1752,15 @@ void device_shutdown(void)
dev_dbg(dev, "shutdown\n");
dev->driver->shutdown(dev);
}
+
+ /*
+ * Make sure the device is off the kset list, in the
+ * event that dev->*->shutdown() didn't remove it.
+ */
+ spin_lock(&devices_kset->list_lock);
+ list_del_init(&dev->kobj.entry);
+ put_device(dev);
}
+ spin_unlock(&devices_kset->list_lock);
async_synchronize_full();
}
