Re: [PATCH v3 06/11] iio: inkern: add module put/get on iio dev module when requesting channels

From: Jonathan Cameron
Date: Wed Apr 18 2018 - 05:37:34 EST


On Tue, 17 Apr 2018 12:19:06 -0700
Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx> wrote:

> Hi Eugen,
>
> On Tue, Apr 17, 2018 at 10:39:24AM +0300, Eugen Hristev wrote:
> >
> >
> > On 17.04.2018 02:58, Dmitry Torokhov wrote:
> > > On Sun, Apr 15, 2018 at 08:33:21PM +0100, Jonathan Cameron wrote:
> > > > On Tue, 10 Apr 2018 11:57:52 +0300
> > > > Eugen Hristev <eugen.hristev@xxxxxxxxxxxxx> wrote:
> > > >
> > > > > When requesting channels for a particular consumer device,
> > > > > besides requesting the device (incrementing the reference counter), also
> > > > > do it for the driver module of the iio dev. This will avoid the situation
> > > > > where the producer IIO device can be removed and the consumer is still
> > > > > present in the kernel.
> > > > >
> > > > > Signed-off-by: Eugen Hristev <eugen.hristev@xxxxxxxxxxxxx>
> > > > > ---
> > > > > drivers/iio/inkern.c | 8 +++++++-
> > > > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c
> > > > > index ec98790..68d9b87 100644
> > > > > --- a/drivers/iio/inkern.c
> > > > > +++ b/drivers/iio/inkern.c
> > > > > @@ -11,6 +11,7 @@
> > > > > #include <linux/slab.h>
> > > > > #include <linux/mutex.h>
> > > > > #include <linux/of.h>
> > > > > +#include <linux/module.h>
> > > > > #include <linux/iio/iio.h>
> > > > > #include "iio_core.h"
> > > > > @@ -152,6 +153,7 @@ static int __of_iio_channel_get(struct iio_channel *channel,
> > > > > if (index < 0)
> > > > > goto err_put;
> > > > > channel->channel = &indio_dev->channels[index];
> > > > > + try_module_get(channel->indio_dev->driver_module);
> > > >
> > > > And if it fails? (the module we are trying to get is going away...)
> > > >
> > > > We should try and handle it I think. Be it by just erroring out of here.
> > >
> > > Even more, this has nothing to do with modules. A device can go away for
> > > any number of reasons (we unbind it manually via sysfs, we pull the USB
> > > plug from the host in case it is USB-connected device, we unload I2C
> > > adapter for the bus device resides on, we kick underlying PCI device)
> > > and we should be able to handle this in some fashion. Handling errors
> > > from reads and ignoring garbage is one of the methods.
> > >
> > > FWIW this is a NACK from me.
> > >
> > > Thanks.
> > Hello,
> >
> > This patch is actually a "best effort attempt" for the consumer driver
> > (touch driver) to get a reference to the producer of the data (the IIO
> > device), when it requests the specific channels.
> > As of this moment, there is no attempt whatsoever for the consumer to have a
> > reference on the producer driver. Thus, the producer can be removed at any
> > time, and the consumer will fail ungracefully.
>
> This is the root of the issue. The consumer should be prepared to handle
> errors from producer.
>
> > I can change the perspective from "best effort" to "mandatory" to get a
> > reference to the producer, or would you prefer that I stop trying to get
> > any reference at all (remove this patch completely)?
>
> You should take reference to the device itself (if it is not taken
> already), so it does not disappear completely and you can continue using
> IIO API to access it, and IIO API should be prepared to deal with "dead"
> devices, but as I pointed out in my other email, trying to pin the driver
> is quite pointless as there are myriad other ways of device stopping
> working besides module unloading.
>
> In any case, I think this problem is outside of the scope of this
> patchset that adds a generic resistive touchscreen, so if you want to
> continue working on this I'd recommend moving it into a separate series.
>
> Thanks.
>
Agreed, this one has come up a number of times before. Quite a lot of
work got done by (IIRC) Lars-Peter Clausen to stabilize things in various
unexpected 'going away' events. Of course there may be paths we have
added since then (it was years ago) that can cause trouble...
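
For anyone following along, a rough userspace sketch of the "dead device"
idea Dmitry describes: the consumer keeps its reference to the device
structure, and every read checks whether the producer's callbacks are
still there before using them. All names here (mock_dev, mock_info,
mock_read_channel_raw, mock_unregister) are made up for illustration;
this is not the IIO core code, only the general shape under the
assumption that removal clears a callback pointer under a lock.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's structures. */
struct mock_info {
	int (*read_raw)(int *val);
};

struct mock_dev {
	pthread_mutex_t info_exist_lock; /* serialises readers against removal */
	const struct mock_info *info;    /* NULL once the producer is gone */
};

/* Pretend ADC read that always succeeds with a fixed sample. */
static int mock_read_raw(int *val)
{
	*val = 42;
	return 0;
}

const struct mock_info mock_adc_info = { .read_raw = mock_read_raw };

/* Consumer-side read: fails cleanly with -ENODEV if the producer is gone. */
int mock_read_channel_raw(struct mock_dev *dev, int *val)
{
	int ret;

	pthread_mutex_lock(&dev->info_exist_lock);
	if (!dev->info)
		ret = -ENODEV;
	else
		ret = dev->info->read_raw(val);
	pthread_mutex_unlock(&dev->info_exist_lock);

	return ret;
}

/* Producer removal: the struct stays allocated, only the callbacks go away. */
void mock_unregister(struct mock_dev *dev)
{
	pthread_mutex_lock(&dev->info_exist_lock);
	dev->info = NULL;
	pthread_mutex_unlock(&dev->info_exist_lock);
}
```

With this shape the consumer sees an error code from every read after
removal, whatever the reason the device went away, and can decide what
to do with it.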

Anyhow, separate issue as Dmitry says, let's deal with it separately.

Jonathan
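
P.S. On the narrower review comment near the top of the thread (the
return value of try_module_get() being ignored), a minimal userspace
sketch of what "erroring out" could look like. The mock_* names are
hypothetical stand-ins that only imitate the contract of
try_module_get()/module_put(); this illustrates the comment and is not
an endorsement of pinning the module, which Dmitry has NACKed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a module and its reference count. */
struct mock_module {
	bool going_away; /* set once unload has started */
	int refcount;
};

/* Imitates try_module_get(): false means no reference was taken. */
bool mock_try_module_get(struct mock_module *mod)
{
	if (!mod)
		return true; /* nothing to pin, e.g. built-in code */
	if (mod->going_away)
		return false;
	mod->refcount++;
	return true;
}

/* Imitates module_put(): drops a reference taken earlier. */
void mock_module_put(struct mock_module *mod)
{
	if (mod)
		mod->refcount--;
}

struct mock_channel {
	struct mock_module *owner; /* module pinned by this channel, if any */
};

/* Request a channel; error out instead of ignoring a failed get. */
int mock_channel_get(struct mock_channel *chan, struct mock_module *owner)
{
	if (!mock_try_module_get(owner))
		return -ENODEV;

	chan->owner = owner;
	return 0;
}

/* Release the channel and the module reference taken at get time. */
void mock_channel_release(struct mock_channel *chan)
{
	mock_module_put(chan->owner);
	chan->owner = NULL;
}
```

The point is only that a failed get leaves nothing to undo on the module
side, so the caller can return the error straight away.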