Re: [driver-core PATCH v7 2/9] driver core: Establish clear order of operations for deferred probe and remove

From: Dan Williams
Date: Thu Nov 29 2018 - 17:00:29 EST


On Thu, Nov 29, 2018 at 1:54 PM Alexander Duyck
<alexander.h.duyck@xxxxxxxxxxxxxxx> wrote:
>
> On Thu, 2018-11-29 at 10:55 -0800, Dan Williams wrote:
> > On Thu, Nov 29, 2018 at 10:07 AM Alexander Duyck
> > <alexander.h.duyck@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, 2018-11-28 at 17:57 -0800, Dan Williams wrote:
> >
> > [..]
> > > > I think the flag should be named "cancel" and set it in the
> > > > device_del() path. Otherwise this is encoding code flow state in the
> > > > struct rather than device-state that the code needs to comprehend.
> > >
> > > Instead of "cancel" what would you think of "dead"? In my mind once we
> > > call device_del we are essentially working with a dead device object so
> > > that might make more sense in terms of a state rather than "cancel"
> > > which doesn't really tell us what should be canceled.
> >
> > That sounds good to me.
> >
> > > Looking over the code I could probably set it before we start calling
> > > the notifiers for BUS_NOTIFY_DEL_DEVICE. The only thing I am not sure
> > > about is if we would need to add any sort of synchronization primitives
> > > around it.
> > >
> >
> > I think it needs to be something like a barrier:
> >
> > dev->dead;
> > device_lock();
> > device_unlock();
> >
> > ...where you can be sure that anyone after that device_unlock() has
> > acted on dev->dead, or will see dev->dead.
>
> Actually what I think I will do is set dev->dead to true with the
> device lock held at the start of device_del. So something like:
> device_lock();
> dev->dead = true;
> device_unlock();

Yeah, the lock is needed anyway since it's a bitfield.

> It seems like that would probably make the most sense and be consistent
> with the handling of other bits such as dev->offline. It means adding
> one more call to device_lock/unlock but it guarantees that the update
> behavior is consistent with the other bitfields near it and that we
> cannot have an asynchronous probe routine trying to run while we are
> tearing out the bus/class/sysfs from underneath the device.

I think that last statement

"guarantee that the update behavior is consistent with the other
bitfields near it and that we cannot have an asynchronous probe
routine trying to run while we are tearing out the bus/class/sysfs
from underneath the device."

...would be good to include as a comment when setting 'dead' to true.