Re: [PATCH 1/5] dmaengine: Store module owner in dma_device struct

From: Logan Gunthorpe
Date: Fri Nov 22 2019 - 15:56:20 EST




On 2019-11-22 1:50 p.m., Dan Williams wrote:
> On Fri, Nov 22, 2019 at 8:53 AM Dave Jiang <dave.jiang@xxxxxxxxx> wrote:
>>
>>
>>
>> On 11/21/19 10:20 PM, Vinod Koul wrote:
>>> On 14-11-19, 10:03, Logan Gunthorpe wrote:
>>>>
>>>>
>>>> On 2019-11-13 9:55 p.m., Vinod Koul wrote:
>>>>>> But that's the problem. We can't expect our users to be "nice" and not
>>>>>> unbind when the driver is in use. Killing the kernel if the user
>>>>>> unexpectedly unbinds is not acceptable.
>>>>>
>>>>> And that is why we review the code and ensure this does not happen and
>>>>> behaviour is as expected
>>>>
>>>> Yes, but the current code can kill the kernel when the driver is unbound.
>>>>
>>>>>>>> I suspect this is less of an issue for most devices as they wouldn't
>>>>>>>> normally be unbound while in use (for example there's really no reason
>>>>>>>> to ever unbind IOAT seeing it's built into the system). Though, the fact
>>>>>>>> is, the user could unbind these devices at anytime and we don't want to
>>>>>>>> panic if they do.
>>>>>>>
>>>>>>> There are many drivers which do modules so yes I am expecting unbind and
>>>>>>> even a bind following that to work
>>>>>>
>>>>>> Except they will panic if they unbind while in use, so that's a
>>>>>> questionable definition of "work".
>>>>>
>>>>> dmaengine core has module reference so while they are being used they
>>>>> won't be removed (unless I complete misread the driver core behaviour)
>>>>
>>>> Yes, as I mentioned in my other email, holding a module reference does
>>>> not prevent the driver from being unbound. Any driver can be unbound by
>>>> the user at any time without the module being removed.
>>>
>>> That sounds okay then.
>>
>> I'm actually glad Logan is putting some work in addressing this. I also
>> ran into the same issue as well dealing with unbinds on my new driver.
>
> This was an original mistake of the dmaengine implementation that
> Vinod inherited. Module pinning is distinct from preventing device
> unbind which ultimately can't be prevented. Longer term I think we
> need to audit dmaengine consumers to make sure they are prepared for
> the driver to be removed similar to how other request based drivers
> can gracefully return an error status when the device goes away rather
> than crashing.

Yes, but that will be a big project because there are a lot of drivers.
But I think the dmaengine common code needs to support removal properly,
which essentially means changing how all the drivers allocate and free
their structures, among other things.

The one saving grace is that most of the drivers are for SOCs which
can't be physically removed and there's really no use-case for the user
to call unbind.

Logan