Re: [RFC PATCH 1/7] drivers: base: Add resource managed version of delayed work init

From: Matti Vaittinen
Date: Mon Feb 15 2021 - 02:04:12 EST



On Sat, 2021-02-13 at 14:18 +0100, Hans de Goede wrote:
> Hi,
>
> On 2/13/21 1:16 PM, Greg Kroah-Hartman wrote:
> > On Sat, Feb 13, 2021 at 01:58:44PM +0200, Matti Vaittinen wrote:
> > > +/**
> > > + * devm_delayed_work_autocancel - Resource-managed work
> > > allocation
> > > + * @dev: Device which lifetime work is bound to
> > > + * @pdata: work to be cancelled when device exits
> > > + *
> > > + * Initialize work which is automatically cancelled when device
> > > exits.
> >
> > There is no such thing in the driver model as "when device exits".
> > Please use the proper terminology as I do not understand what you
> > think
> > this is doing here...
>
> I agree that this needs better wording I always talk about driver-
> unbinding
> because sysfs has /sys/bus/*/drivers/*/bind and
> /sys/bus/*/drivers/*/unbind
> attributes. But I see that the relevant driver-core functions all
> call it
> driver detaching, so lets be consistent and use that here too.

//Snip

> > > @@ -249,6 +250,10 @@ void __iomem *devm_of_iomap(struct device
> > > *dev,
> > > struct device_node *node, int index,
> > > resource_size_t *size);
> > >
> > > +/* delayed work which is cancelled when driver exits */
> >
> > Not when the "driver exits".
>
> Right this should be detached not exits.
>

Thanks guys.
I am poor with the terminology so I do appreciate your help in getting
this right. I can change this for the v2.


> > There is two different lifespans here (well 3). Code and
> > data*2. Don't
> > confuse them as that will just cause lots of problems.
> >
> > The move toward more and more "devm" functions is not the way to go
> > as
> > they just more and more make things easier to get wrong.
> >
> > APIs should be impossible to get wrong, this one is going to be
> > almost
> > impossible to get right.
>
> I have to disagree here devm generally makes it easier to get things
> right,
> it is when some devm functions are missing and devm and non devm
> resources
> are mixed that things get tricky.

Thanks for the discussion. I hope we can come to some conclusion here.
Unsurprisingly I agree with Hans here. I did after all send this patch
series :) I guess I am mostly just repeating what he said.

As Hans pointed out, when all calls are 'undone' by devm the order of
'undoing' is highly likely to be correct as the unwinding is done in
reverse order to initializations. I think it is sane to assume in most
case things are initiated in order where operations which depend on
something are done last - and when 'unwinding' things those are
'undone' first.

My 'gut feeling' for probe / remove related errors is that the most
usual errors I've seen have been:

a) Tear-down completely forgotten
b) Tear-down forgotten at error path
c) Wrong order of initiating things (IRQ requested prior resource
initialization)
d) Wrong order of cleann-up at remove.

a) and b) class errors have been the most common ones I've seen. They
can be completely avoided when devm is used.
c) is there no matter if we use devm or not.
d) is mostly avoided when only devm is used - mixing devm and manual
operations make this more likely as Hans pointed out. As long as we
have some devm operations we should help avoid mixing devm and manual
clean-up.

Best Regards
Matti Vaittinen