Re: [PATCH 1/5] dmaengine: Store module owner in dma_device struct

From: Dan Williams
Date: Fri Nov 22 2019 - 16:01:23 EST


On Fri, Nov 22, 2019 at 12:56 PM Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:
>
>
>
> On 2019-11-22 1:50 p.m., Dan Williams wrote:
> > On Fri, Nov 22, 2019 at 8:53 AM Dave Jiang <dave.jiang@xxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 11/21/19 10:20 PM, Vinod Koul wrote:
> >>> On 14-11-19, 10:03, Logan Gunthorpe wrote:
> >>>>
> >>>>
> >>>> On 2019-11-13 9:55 p.m., Vinod Koul wrote:
> >>>>>> But that's the problem. We can't expect our users to be "nice" and not
> >>>>>> unbind when the driver is in use. Killing the kernel if the user
> >>>>>> unexpectedly unbinds is not acceptable.
> >>>>>
> >>>>> And that is why we review the code and ensure this does not happen and
> >>>>> behaviour is as expected
> >>>>
> >>>> Yes, but the current code can kill the kernel when the driver is unbound.
> >>>>
> >>>>>>>> I suspect this is less of an issue for most devices as they wouldn't
> >>>>>>>> normally be unbound while in use (for example there's really no reason
> >>>>>>>> to ever unbind IOAT seeing it's built into the system). Though, the fact
> >>>>>>>> is, the user could unbind these devices at anytime and we don't want to
> >>>>>>>> panic if they do.
> >>>>>>>
> >>>>>>> There are many drivers which do modules so yes I am expecting unbind and
> >>>>>>> even a bind following that to work
> >>>>>>
> >>>>>> Except they will panic if they unbind while in use, so that's a
> >>>>>> questionable definition of "work".
> >>>>>
> >>>>> dmaengine core has module reference so while they are being used they
> >>>>> won't be removed (unless I complete misread the driver core behaviour)
> >>>>
> >>>> Yes, as I mentioned in my other email, holding a module reference does
> >>>> not prevent the driver from being unbound. Any driver can be unbound by
> >>>> the user at any time without the module being removed.
> >>>
> >>> That sounds okay then.
> >>
> >> I'm actually glad Logan is putting some work in addressing this. I also
> >> ran into the same issue as well dealing with unbinds on my new driver.
> >
> > This was an original mistake of the dmaengine implementation that
> > Vinod inherited. Module pinning is distinct from preventing device
> > unbind which ultimately can't be prevented. Longer term I think we
> > need to audit dmaengine consumers to make sure they are prepared for
> > the driver to be removed similar to how other request based drivers
> > can gracefully return an error status when the device goes away rather
> > than crashing.
>
> Yes, but that will be a big project because there are a lot of drivers.

Oh yes, in fact I think it's something that can only reasonably be
considered for new consumers.

> But I think the dmaengine common code needs to support removal properly,
> which essentially means changing how all the drivers allocate and free
> their structures, among other things.
>
> The one saving grace is that most of the drivers are for SOCs which
> can't be physically removed and there's really no use-case for the user
> to call unbind.

Yes, the SOC case is not so much my concern as the generic offload use
cases, especially if those offloads are in a similar hotplug domain as
a cpu.