Re: [driver-core PATCH v8 2/9] driver core: Establish order of operations for device_add and device_del via bitflag

From: Dan Williams
Date: Mon Dec 10 2018 - 14:43:33 EST


On Mon, Dec 10, 2018 at 11:35 AM Alexander Duyck
<alexander.h.duyck@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, 2018-12-10 at 10:58 -0800, Dan Williams wrote:
> > On Wed, Dec 5, 2018 at 9:25 AM Alexander Duyck
> > <alexander.h.duyck@xxxxxxxxxxxxxxx> wrote:
> > >
> > > Add an additional bit flag to the device struct named "dead".
> > >
> > > This additional flag provides a guarantee that when a device_del is
> > > executed on a given interface an async worker will not attempt to attach
> > > the driver following the earlier device_del call. Previously this
> > > guarantee was not present and could result in the device_del call
> > > attempting to remove a driver from an interface only to have the async
> > > worker attempt to probe the driver later when it finally completes the
> > > asynchronous probe call.
> > >
> > > Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> > > ---
> > > drivers/base/core.c | 11 +++++++++++
> > > drivers/base/dd.c | 8 ++++++--
> > > include/linux/device.h | 5 +++++
> > > 3 files changed, 22 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > index f3e6ca4170b4..70358327303b 100644
> > > --- a/drivers/base/core.c
> > > +++ b/drivers/base/core.c
> > > @@ -2075,6 +2075,17 @@ void device_del(struct device *dev)
> > > struct kobject *glue_dir = NULL;
> > > struct class_interface *class_intf;
> > >
> > > + /*
> > > + * Hold the device lock and set the "dead" flag to guarantee that
> > > + * the update behavior is consistent with the other bitfields near
> > > + * it and that we cannot have an asynchronous probe routine trying
> > > + * to run while we are tearing out the bus/class/sysfs from
> > > + * underneath the device.
> > > + */
> > > + device_lock(dev);
> > > + dev->dead = true;
> > > + device_unlock(dev);
> > > +
> > > /* Notify clients of device removal. This call must come
> > > * before dpm_sysfs_remove().
> > > */
> > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > > index 88713f182086..3bb8c3e0f3da 100644
> > > --- a/drivers/base/dd.c
> > > +++ b/drivers/base/dd.c
> > > @@ -774,6 +774,10 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
> > >
> > > device_lock(dev);
> > >
> > > + /* device is or has been removed from the bus, just bail out */
> > > + if (dev->dead)
> > > + goto out_unlock;
> > > +
> >
> > What do you think about moving this check into
> > __device_attach_driver() alongside all the other checks? That way we
> > also get ->dead checking through the __device_attach() path.
>
> I'm not really sure that is the best spot to do that. Part of the
> reason being that by placing it where I did we avoid messing with the
> runtime power management for the parent if it was already powered off.

...but this already a rare event and the parent shouldn't otherwise be
bothered by a spurious pm_runtime wakeup event.

> If anything I would say we could probably look at pulling the check out
> and placing the driver check in __device_attach_async_helper since from
> what I can tell the check is actually redundant in the non-async path
> anyway since __device_attach already had taken the device lock and
> checked dev->driver prior to calling __device_attach_driver.
>
> > ...and after that maybe it could be made a common helper
> > (dev_driver_checks()?) shared between __device_attach_driver() and
> > __driver_attach() to reduce some duplication.
>
> I'm not sure consolidating it into a function would really be worth the
> extra effort. It would essentially just obfuscate the checks and I am
> not sure you really save much with:
> if (dev_driver_checks(dev))
> vs:
> if (!dev->dead && !dev->driver)
>
> By the time you create the function and replace the few spots that are
> making these checks you would end up most likely adding more complexity
> to the kernel rather than reducing it any.

No, I was talking about removing this duplication in
__device_attach_driver() and __driver_attach():

if (ret == 0) {
/* no match */
return 0;
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
} /* ret > 0 means positive match */

...and lead in with a dev->dead check.