Re: [PATCH 1/6] mmput: use notifier chain to call subsystem exit handler.

From: David Woodhouse
Date: Sun Oct 11 2015 - 15:03:55 EST


On Tue, 2014-07-08 at 13:03 -0400, Jerome Glisse wrote:
>
> Now regarding the device side, if we were to cleanup inside the file release
> callback than we would be broken in front of fork. Imagine the following :
> - process A open device file and mirror its address space (hmm or kfd)
> through a device file.
> - process A forks itself (child is B) while having the device file open.
> - process A quits
>
> Now the file release will not be call until child B exit which might infinite.
> Thus we would be leaking memory. As we already pointed out we can not free the
> resources from the mmu_notifier >release callback.

So if your library just registers a pthread_atfork() handler to close
the file descriptor in the child, that problem goes away? Like any
other "if we continue to hold file descriptors open when we should
close them, resources don't get freed" problem?

Perhaps we could even handled that automatically in the kernel, with
something like an O_CLOFORK flag on the file descriptor. Although
that's a little horrid.

You've argued that the amdkfd code is special and not just a device
driver. I'm not going to contradict you there, but now we *are* going
to see device drivers doing this kind of thing. And we definitely
*don't* want to be calling back into device driver code from the
mmu_notifier's .release() function.

I think amdkfd absolutely is *not* a useful precedent for normal device
drivers, and we *don't* want to follow this model in the general case.

As we try to put together a generic API for device access to processes'
address space, I definitely think we want to stick with the model that
we take a reference on the mm, and we *keep* it until the device driver
unbinds from the mm (because its file descriptor is closed, or
whatever).

Perhaps you can keep a back door into the AMD IOMMU code to continue to
do what you're doing, or perhaps the explicit management of off-cpu
tasks that is being posited will give you a sane cleanup path that
*doesn't* involve the IOMMU's mmu_notifier calling back to you. But
either way, I *really* don't want this to be the way it works for
device drivers.


> One hacky way to do it would be to schedule some delayed job from
> >release callback but then again we would have no way to properly
> synchronize ourself with other mm destruction code ie the delayed job
> could run concurently with other mm destruction code and interfer
> badly.

With the RCU-based free of the struct containing the notifier, your
'schedule some delayed job' is basically what we have now, isn't it?

I note that you also have your *own* notifier which does other things,
and has to cope with being called before or after the IOMMU's notifier.

Seriously, we don't want device drivers having to do this. We really
need to keep it simple.

--
David Woodhouse Open Source Technology Centre
David.Woodhouse@xxxxxxxxx Intel Corporation

Attachment: smime.p7s
Description: S/MIME cryptographic signature