Re: [PATCH v9 7/8] PCI/P2PDMA: Allow userspace VMA allocations through sysfs

From: Greg Kroah-Hartman
Date: Fri Sep 02 2022 - 01:54:09 EST


On Thu, Sep 01, 2022 at 01:16:54PM -0600, Logan Gunthorpe wrote:
>
>
> On 2022-09-01 12:36, Greg Kroah-Hartman wrote:
> > On Thu, Sep 01, 2022 at 12:14:25PM -0600, Logan Gunthorpe wrote:
> >> Well we haven't plugged in a remove call into p2pdma, that would be more
> >> work and more interfaces touching the PCI code. Note: this code isn't a
> >> driver but a set of PCI helpers available to other PCI drivers.
> >> Everything that's setup is using the devm interfaces and gets torn down
> >> with the same. So I don't really see the benefit of making the change
> >> you propose.
> >
> > The issue is the classic one with the devm helpers. They do not lend
> > themselves to resource management issues that require ordering or other
> > sort of dependencies. Please do not use them here, just put in a remove
> > callback as you eventually will need it anyway, as you have a strong
> > requirement for what gets freed when, and the devm api does not provide
> > for that well.
>
> This surprises me. Can you elaborate on this classic issue?

There's long threads about it on the ksummit discuss mailing list and
other places.

> I've definitely seen uses of devm that expect the calls will be torn
> down in reverse order they are added.

Sorry, I didn't mean to imply the ordering of the devm code is
incorrect, that's fine.

It's when you have things in the devm "chain" that need to be freed in a
different order that stuff gets messy. Like irqs and clocks and other
types of resources that have "actions" associated with them.

> The existing p2pdma code will
> certainly fail quite significantly if a devm_kzalloc() releases its
> memory before the devm_memmap_pages() cleans up. There's also already an
> action that is used to cleanup before the last devm_kzalloc() call
> happens. If ordering is not guaranteed, then devm seems fairly broken
> and unusable and I'd have to drop all uses from this code and go back to
> the error prone method. Also what's the point of
> devm_add_action_or_reset() if it doesn't guarantee the ordering or the
> release?

I have never used devm_add_action_or_reset() so I can't say why it is
there. I am just pointing out that manually messing with a sysfs group
from a driver is a huge flag that something is wrong. A driver should
almost never be touching a raw kobject or calling any sysfs_* call if
all is normal, which is why I questioned this.

> But if it's that important I can make the change to these patches for v10.

Try it the way I suggest, with a remove() callback, and see if that
looks simpler and easier to follow and maintain over time.

thanks,

greg k-h