Re: [PATCH] driver core: fix shutdown races with probe/remove
From: Alan Stern
Date: Tue Jun 05 2012 - 10:47:03 EST
On Tue, 5 Jun 2012, Ming Lei wrote:
> Firstly, .shutdown callback may touch a uninitialized hardware
> if dev->driver is set and .probe is not completed.
>
> Secondly, device_shutdown() may dereference a null pointer to cause
> oops when dev->driver is cleared after it is checked in device_shutdown().
>
> This patch uses SRCU and ACCESS_ONCE to avoid the races:
> - don't start .shutdown until .probe in progess is completed
> - avoid running .probe and .remove once shutdown is started
Avoid running probe, that's fine. But avoiding remove can lead to
problems, because the subsystem and the driver will no longer agree on
who should manage the device.
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -49,6 +49,9 @@ static struct kobject *dev_kobj;
> struct kobject *sysfs_dev_char_kobj;
> struct kobject *sysfs_dev_block_kobj;
>
> +struct srcu_struct driver_srcu;
> +int device_shutdown_started;
> +
> #ifdef CONFIG_BLOCK
> static inline int device_is_not_partition(struct device *dev)
> {
> @@ -1803,6 +1806,9 @@ void device_shutdown(void)
> {
> struct device *dev;
>
> + ACCESS_ONCE(device_shutdown_started) = 1;
This does not need to be ACCESS_ONCE. ACCESS_ONCE is needed only in
situations where the compiler might insert multiple reads or writes.
> + synchronize_srcu(&driver_srcu);
> +
> spin_lock(&devices_kset->list_lock);
> /*
> * Walk the devices list backward, shutting down each in turn.
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 1b1cbb5..5587037 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -250,8 +250,13 @@ static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
> static int really_probe(struct device *dev, struct device_driver *drv)
> {
> int ret = 0;
> + int idx;
>
> + idx = srcu_read_lock(&driver_srcu);
> atomic_inc(&probe_count);
> + if (ACCESS_ONCE(device_shutdown_started))
This doesn't need to be ACCESS_ONCE either. If it would make you feel
better, you could write
if (rcu_dereference(&device_shutdown_started))
> + goto done;
You need to set ret to something other than 0, because the probe has
not succeeded. For example, set it to -ESHUTDOWN.
> +
> pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
> drv->bus->name, __func__, drv->name, dev_name(dev));
> WARN_ON(!list_empty(&dev->devres_head));
> @@ -305,6 +310,7 @@ probe_failed:
> done:
> atomic_dec(&probe_count);
> wake_up(&probe_waitqueue);
> + srcu_read_unlock(&driver_srcu, idx);
> return ret;
> }
>
> @@ -467,6 +473,12 @@ EXPORT_SYMBOL_GPL(driver_attach);
> static void __device_release_driver(struct device *dev)
> {
> struct device_driver *drv;
> + int idx;
> +
> + idx = srcu_read_lock(&driver_srcu);
> +
> + if (ACCESS_ONCE(device_shutdown_started))
> + goto exit;
As mentioned above, this is not good. I don't know... maybe it won't
ever cause any difficulties.
Alan Stern
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/