Re: Race condition between driver_probe_device and device_shutdownâ

From: Ming Lei
Date: Sun May 20 2012 - 10:08:19 EST


Hi,

On Fri, May 18, 2012 at 12:59 PM, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>> Hi,
>
> First off, sorry for missing this, and thanks to Andrew for pointing it
> out to me.  You might want to use the tool, scripts/get_maintainer.pl
> for who to know to cc: for patches like this, so I don't miss it.
>
>> I'm seeing a driver crash in its shutdown routine because it's
>> touching some uninitialized state. It turns out that the driver's
>> probe routine was still running [for the same device]. There also
>> appears to be an issue in the remove path, where device_shutdown()
>> checks the dev->driver pointer and uses it later, with seemingly
>> nothing to guarantee that it doesn't change.
>
> What type of driver is having this problem?  What type of bus is it on?
> Usually the bus prevents this from happening with its own serialization.

Looks it is a generic problem.

There are two races, one is between .probe and .shutdown, and another
is between .release and .shutdown, see below:


void device_shutdown(void)

......
/* Don't allow any more runtime suspends */
pm_runtime_get_noresume(dev);
pm_runtime_barrier(dev);

if (dev->bus && dev->bus->shutdown) {
dev_dbg(dev, "shutdown\n");
dev->bus->shutdown(dev);
} else if (dev->driver && dev->driver->shutdown) { /*line-driver*/
dev_dbg(dev, "shutdown\n");
dev->driver->shutdown(dev); /*line-shut*/
}
......

If dev->driver is just set(really_probe) before 'line-driver' and .probe is
not executed before 'line-shut', the .shutdown may touch a uninitialized
device.

Also if dev->driver is just cleared(__device_release_driver) after "line-driver"
and before "line-shut", null pointer will be referenced and oops will
be triggered.


>> Shouldn't we synchronize the shutdown routine with probe/remove to
>> prevent such races?
>
> Normally, yes, and for some reason, I thought we already were doing
> that.

Looks the races are still there.

>> The patch below should take care of these races.
>
> Does this patch solve your problem?  Care to show me the oops you get
> without it?
>
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index e28ce98..f2c63c6 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -1823,6 +1823,9 @@ void device_shutdown(void)
>>                 pm_runtime_get_noresume(dev);
>>                 pm_runtime_barrier(dev);
>>
>> +               if (dev->parent)        /* Needed for USB */
>> +                       device_lock(dev->parent);
>> +               device_lock(dev);

Looks the above makes sense to serialize .shutdown with
.probe and .release.


Thanks,
--
Ming Lei
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/