Re: [PATCH v2 5/7] iio: inkern: copy/release available info from producer
From: Jonathan Cameron
Date: Sat Oct 12 2024 - 11:47:51 EST
On Wed, 09 Oct 2024 20:30:15 +0200
Matteo Martelli <matteomartelli3@xxxxxxxxx> wrote:
> Quoting Nuno Sá (2024-10-08 14:37:22)
> > On Tue, 2024-10-08 at 10:03 +0200, Matteo Martelli wrote:
> > > Quoting Nuno Sá (2024-10-08 09:29:14)
> > > > On Tue, 2024-10-08 at 08:47 +0200, Matteo Martelli wrote:
> > > > > Quoting Nuno Sá (2024-10-07 17:15:13)
> > > > > > On Mon, 2024-10-07 at 10:37 +0200, Matteo Martelli wrote:
> > > > > > > Consumers need to call the read_avail_release_resource after reading
> > > > > > > the
> > > > > > > available info. To call the release with info_exists locked, copy the
> > > > > > > available info from the producer and immediately call its release
> > > > > > > callback. With this change, users of iio_read_avail_channel_raw() and
> > > > > > > iio_read_avail_channel_attribute() must free the copied avail info
> > > > > > > after
> > > > > > > calling them.
> > > > > > >
> > > > > > > Signed-off-by: Matteo Martelli <matteomartelli3@xxxxxxxxx>
> > > > > > > ---
> > > > > > > drivers/iio/inkern.c | 64 +++++++++++++++++++++++++++++++++--
> > > > > > > ----
> > > > > > > -----
> > > > > > > include/linux/iio/consumer.h | 4 +--
> > > > > > > 2 files changed, 50 insertions(+), 18 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c
> > > > > > > index
> > > > > > > 7f325b3ed08fae6674245312cf8f57bb151006c0..cc65ef79451e5aa2cea447e16800
> > > > > > > 7a44
> > > > > > > 7ffc0d91
> > > > > > > 100644
> > > > > > > --- a/drivers/iio/inkern.c
> > > > > > > +++ b/drivers/iio/inkern.c
> > > > > > > @@ -760,9 +760,25 @@ static int iio_channel_read_avail(struct
> > > > > > > iio_channel
> > > > > > > *chan,
> > > > > > > if (!iio_channel_has_available(chan->channel, info))
> > > > > > > return -EINVAL;
> > > > > > >
> > > > > > > - if (iio_info->read_avail)
> > > > > > > - return iio_info->read_avail(chan->indio_dev, chan-
> > > > > > > >channel,
> > > > > > > - vals, type, length, info);
> > > > > > > + if (iio_info->read_avail) {
> > > > > > > + const int *vals_tmp;
> > > > > > > + int ret;
> > > > > > > +
> > > > > > > + ret = iio_info->read_avail(chan->indio_dev, chan-
> > > > > > > >channel,
> > > > > > > + &vals_tmp, type, length,
> > > > > > > info);
> > > > > > > + if (ret < 0)
> > > > > > > + return ret;
> > > > > > > +
> > > > > > > + *vals = kmemdup_array(vals_tmp, *length, sizeof(int),
> > > > > > > GFP_KERNEL);
> > > > > > > + if (!*vals)
> > > > > > > + return -ENOMEM;
> > > > > > > +
> > > > > >
> > > > > > Not a big deal but I would likely prefer to avoid yet another copy. If
> > > > > > I'm
> > > > > > understanding things correctly, I would rather create an inkern wrapper
> > > > > > API
> > > > > > like
> > > > > > iio_channel_read_avail_release_resource() - maybe something with a
> > > > > > smaller
> > > > > > name :).
> > > > > > Hence, the lifetime of the data would be only controlled by the producer
> > > > > > of
> > > > > > it. It
> > > > > > would also produce a smaller diff (I think). I just find it a bit
> > > > > > confusing
> > > > > > that we
> > > > > > duplicate the data in here and the producer also duplicates it on the -
> > > > > > > read_avail()
> > > > > > call. Another advantage I see is that often the available data is indeed
> > > > > > const in
> > > > > > which case no kmemdup_array() is needed at all.
> > > > >
> > > > >
> > > > > If I understand correctly your suggestion you would leave the inkern
> > > > > iio_channel_read_avail() untouched, then add a new inkern wrapper,
> > > > > something
> > > > > like iio_channel_read_avail_release_resource(), that would call the
> > > > > producer's
> > > > > read_avail_release_resource(). The consumer would invoke this new wrapper
> > > > > in
> > > > > its
> > > > > own read_avail_release_resource() avoiding the additional copy. The call
> > > > > stack
> > > > > would look something like the following:
> > > > >
> > > > > iio_read_channel_info_avail() {
Here you are talking about the core code that produces a string.
But you've done that in reply to v5 which is about inkern interfaces
that don't work in strings...
> > > > > consumer->read_avail() {
> > > > > iio_read_avail_channel_raw() {
> > > > > iio_channel_read_avail() {
> > > > > producer->read_avail() {
> > > > > kmemdup_array();
> > > > > }
> > > > > }
> > > > > }
> > > > > }
> > > > >
> > > > > iio_format_list();
That's effectively making the safe copy that is handed back to the
caller. So this is fine.
> > > > >
> > > > > consumer->read_avail_release_resource() {
> > > > > iio_read_avail_channel_release_resource() {
> > > > > producer->read_avail_release_resource() {
> > > > > kfree();
> > > > > }
> > > > > }
> > > > > }
> > > > > }
> > > >
> > > > Yeah, exactly what came to mind...
I'm very confused what scope of comments here is. Maybe the easiest thing is to send the code.
> > > >
> > > > >
> > > > >
> > > > > I was going with the simpler solution you described, but my concern with
> > > > > it
> > > > > was
> > > > > that the info_exists_lock mutex would be unlocked between a
> > > > > iio_channel_read_avail()
> > > > > call and its corresponding iio_channel_read_avail_release_resource() call.
> > > > > To my understanding, this could potentially allow for the device to be
> > > > > unregistered between the two calls and result in a memleak of the avail
> > > > > buffer
> > > > > allocated by the producer.
Yes. That's why we have to free the produced copy under the exist_lock
(and take yet another copy).
> > > > >
> > > > > However, I have been trying to reproduce a similar case by adding a delay
> > > > > between the consumer->read_avail() and the
> > > > > consumer->read_avail_release_resources(), and by unbinding the driver
> > > > > during
> > > > > that delay, thus with the info_exists_lock mutex unlocked. In this case
> > > > > the
> > > > > driver is not unregistered until the iio_read_channel_info_avail()
> > > > > function
> > > > > completes, likely because of some other lock on the sysfs file after the
> > > > > call
> > > > > of
> > > > > cdev_device_del() in iio_device_unregister().
> > > > >
> > > >
> > > > Yes, you need to have some sync point at the kernfs level otherwise we could
> > > > always be handling a sysfs attr while the device is being removed under our
> > > > feet. But I'm not sure what you're trying to do... IIUC, the problem might
> > > > come
> > > > if have:
> > > >
> > > > consumer->read_avail_channel_attribute()
> > > > producer->info_lock()
> > > > producer->read_avail()
> > > > producer->kmalloc()
> > > >
> > > > ...
> > > > // producer unbound
> > > > ...
> > > > consumer->read_avail_release()
> > > > return -ENODEV;
> > > >
> > > > // producer->kmalloc() never get's freed...
> > > >
> > > > The above is your problem right? And I think it should be a valid one since
> > > > between ->read_avail_channel_attribute() and read_avail_release() there's
> > > > nothing preventing the producer from being unregistered...
> > >
> > > Yes, that's the problem.
> > >
> > > >
> > > > If I'm not missing nothing one solution would be for the producer to do
> > > > devm_kmalloc() and devm_kfree() on read_avail() and release_resources() but
> > > > at
> > > > that point I'm not sure it's better than what you have since it's odd enough
> > > > for
> > > > being missed in reviews...
If it's an allocation to keep a copy for an active consumer then that is nasty
as the lifetime is completely untidied. Effectively you are garbage collecting.
> > >
> > > I honestly didn't think of this and it would in fact prevent the
> > > additional copy. But I agree that it could be missed in new drivers,
> > > maybe a comment in the iio_info read_avail_release_resource() callback
> > > declaration would help?
> > > >
> > At this point I would say whatever you or Jonathan prefer :)
> >
>
> I run some quick tests with this approach and haven't found any issue so
> far.
I can't see what is preventing the race you describe where the
release callback is swept out by a driver unbind of the provider.
So unless we can show that is safe I don't see a way to avoid a consumer copy.
Long shot, Lars-Peter I think you fixed up most of the previous races in this
code, don't suppose you remember how it works?
> I would personally switch to this approach as it would be much
> simpler and easier to understand, and since the avail lists are const
> for most of the current drivers I would not expect many new drivers
> needing a dynamic available list. However, I will wait Jonathan feedback
> first.
The idea here is almost no one actually makes a copy in the producer.
The consumer copy is a necessity to my thinking because we are effectively
passing the ownership of the data. Unfortunately we have no idea how
the producer would free it so we need to create our own copy.
Key here is this a very rare operation. No one polls available
data at high frequency, it's a read once kind of thing typically.
+ these are normally really short lists in the cases we actually use
(so far I 'think' they have always been the 3 value range form, not
a list of potential values).
Jonathan
>
> About the release wrapper name: even though "release_resource" looks a
> common suffix for this kind of pattern,
> iio_read_avail_channel_release_resource() seems in fact extremely long
> and I would go for something like iio_read_avail_channel_release(). At
> that point I would also shorten the iio_info release function name for
> consistency, like read_avail_release_resource() => read_avail_release().
> I hope such names would be clear enough though. Any feedback on this?
>
> > - Nuno Sá
>
> Thanks,
> Matteo Martelli