Re: [PATCH 2/2] gpiolib: cdev: release IRQs when the gpio chip device is removed

From: Bartosz Golaszewski
Date: Thu Feb 22 2024 - 03:31:33 EST


On Thu, 22 Feb 2024 02:05:30 +0100, Kent Gibson <warthog618@xxxxxxxxx> said:
> On Thu, Feb 22, 2024 at 08:57:44AM +0800, Kent Gibson wrote:
>> On Tue, Feb 20, 2024 at 10:29:59PM +0800, Kent Gibson wrote:
>> > On Tue, Feb 20, 2024 at 12:10:18PM +0100, Herve Codina wrote:
>>
>> ...
>>
>> > > }
>> > >
>> > > +static int linereq_unregistered_notify(struct notifier_block *nb,
>> > > +				       unsigned long action, void *data)
>> > > +{
>> > > +	struct linereq *lr = container_of(nb, struct linereq,
>> > > +					  device_unregistered_nb);
>> > > +	int i;
>> > > +
>> > > +	for (i = 0; i < lr->num_lines; i++) {
>> > > +		if (lr->lines[i].desc)
>> > > +			edge_detector_stop(&lr->lines[i]);
>> > > +	}
>> > > +
>> >
>> > Firstly, the re-ordering in the previous patch creates a race,
>> > as the NULLing of the gdev->chip serves to numb the cdev ioctls, so
>> > there is now a window between the notifier being called and that numbing,
>> > during which userspace may call linereq_set_config() and re-request
>> > the irq.
>> >
>> > There is also a race here with linereq_set_config(). That can be prevented
>> > by holding the lr->config_mutex - assuming the notifier is not being called
>> > from atomic context.
>> >
>>
>> It occurs to me that the fixed reordering in patch 1 would place
>> the notifier call AFTER the NULLing of the ioctls, so there will no longer
>> be any chance of a race with linereq_set_config() - so holding the
>> config_mutex semaphore is not necessary.
>>
>
> NULLing -> numbing
>
> The gdev->chip is NULLed, so the ioctls are numbed.
> And I need to let the coffee soak in before sending.
>
>> In which case this patch is fine - it is only patch 1 that requires
>> updating.
>>
>> Cheers,
>> Kent.
>
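
The ordering argument quoted above can be illustrated with a small user-space
model. This is only a sketch: every name in it (model_gdev, model_set_config,
model_notify_unregistered, model_remove) is hypothetical and none of it is
gpiolib code. It assumes that the set-config path refuses to act once the chip
pointer is NULL, and it replays the steps sequentially rather than modelling
real concurrency or locking.

```c
/* User-space model only; none of these names are kernel APIs. */
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_gdev {
	const void *chip;	/* NULL once the chip is gone: ioctls are "numbed" */
	bool irq_requested;	/* stands in for the IRQ held by a requested line */
};

/* Models the set-config ioctl: it refuses to act once chip is NULL. */
static int model_set_config(struct model_gdev *gdev)
{
	if (!gdev->chip)
		return -ENODEV;
	gdev->irq_requested = true;	/* re-requests the IRQ */
	return 0;
}

/* Models the unregister notifier: it releases the IRQ. */
static void model_notify_unregistered(struct model_gdev *gdev)
{
	gdev->irq_requested = false;
}

/*
 * Replays a chip removal with one set-config call from userspace landing
 * right after the notifier has run.
 * Returns true if an IRQ is still held once removal has finished.
 */
static bool model_remove(bool notify_before_null)
{
	static const int dummy_chip;
	struct model_gdev gdev = { .chip = &dummy_chip, .irq_requested = true };

	if (notify_before_null) {
		model_notify_unregistered(&gdev);
		(void)model_set_config(&gdev);	/* succeeds: chip is still set */
		gdev.chip = NULL;
	} else {
		gdev.chip = NULL;
		model_notify_unregistered(&gdev);
		(void)model_set_config(&gdev);	/* fails with -ENODEV */
	}

	return gdev.irq_requested;
}
```

In this model, model_remove(true) leaves the IRQ requested after removal,
while model_remove(false) does not, which matches the conclusion that the
notifier has to run after the chip pointer is cleared.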

The fix for the user-space issue may be more-or-less correct but the problem is
deeper and this won't fix it for in-kernel users.

Herve: please consider the following DT snippet:

	gpio0 {
		compatible = "foo";

		gpio-controller;
		#gpio-cells = <2>;
		interrupt-controller;
		#interrupt-cells = <1>;
		ngpios = <8>;
	};

	consumer {
		compatible = "bar";

		interrupts-extended = <&gpio0 0>;
	};

If you unbind the "gpio0" device after the consumer requested the interrupt,
you'll get the same splat. And device links will not help you here (on that
note: Saravana: is there anything we could do about it? Have you even
considered making the irqchip subsystem use the driver model in any way? Is it
even feasible?).

I would prefer this to be fixed at a lower level than the GPIOLIB character
device.

Bartosz