Re: [PATCH 1/5] dmaengine: Store module owner in dma_device struct

From: Dave Jiang
Date: Fri Nov 22 2019 - 16:45:18 EST




On 11/22/19 2:01 PM, Dan Williams wrote:
On Fri, Nov 22, 2019 at 12:56 PM Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:



On 2019-11-22 1:50 p.m., Dan Williams wrote:
On Fri, Nov 22, 2019 at 8:53 AM Dave Jiang <dave.jiang@xxxxxxxxx> wrote:



On 11/21/19 10:20 PM, Vinod Koul wrote:
On 14-11-19, 10:03, Logan Gunthorpe wrote:


On 2019-11-13 9:55 p.m., Vinod Koul wrote:
But that's the problem. We can't expect our users to be "nice" and not
unbind when the driver is in use. Killing the kernel if the user
unexpectedly unbinds is not acceptable.

And that is why we review the code and ensure this does not happen and
behaviour is as expected

Yes, but the current code can kill the kernel when the driver is unbound.

I suspect this is less of an issue for most devices as they wouldn't
normally be unbound while in use (for example there's really no reason
to ever unbind IOAT seeing it's built into the system). Though, the fact
is, the user could unbind these devices at anytime and we don't want to
panic if they do.

There are many drivers which do modules so yes I am expecting unbind and
even a bind following that to work

Except they will panic if they unbind while in use, so that's a
questionable definition of "work".

dmaengine core has module reference so while they are being used they
won't be removed (unless I complete misread the driver core behaviour)

Yes, as I mentioned in my other email, holding a module reference does
not prevent the driver from being unbound. Any driver can be unbound by
the user at any time without the module being removed.

That sounds okay then.

I'm actually glad Logan is putting some work in addressing this. I also
ran into the same issue as well dealing with unbinds on my new driver.

This was an original mistake of the dmaengine implementation that
Vinod inherited. Module pinning is distinct from preventing device
unbind which ultimately can't be prevented. Longer term I think we
need to audit dmaengine consumers to make sure they are prepared for
the driver to be removed similar to how other request based drivers
can gracefully return an error status when the device goes away rather
than crashing.

Yes, but that will be a big project because there are a lot of drivers.

Oh yes, in fact I think it's something that can only reasonably be
considered for new consumers.

But I think the dmaengine common code needs to support removal properly,
which essentially means changing how all the drivers allocate and free
their structures, among other things.

The one saving grace is that most of the drivers are for SOCs which
can't be physically removed and there's really no use-case for the user
to call unbind.

Yes, the SOC case is not so much my concern as the generic offload use
cases, especially if those offloads are in a similar hotplug domain as
a cpu.


It becomes a bigger issue when "channels" can be reconfigured and can come and go in a hot plug fashion.