Re: [PATCH] ib_umem_release should decrement mm->pinned_vm from ib_umem_get
From: Shawn Bohrer
Date: Thu Aug 28 2014 - 17:52:06 EST
On Thu, Aug 28, 2014 at 02:48:19PM +0300, Haggai Eran wrote:
> On 26/08/2014 00:07, Shawn Bohrer wrote:
> >>>> The following patch fixes the issue by storing the mm_struct of the
> >> >
> >> > You are doing more than just storing the mm_struct - you are taking
> >> > a reference to the process' mm. This can lead to a massive resource
> >> > leakage. The reason is bit complex: The destruction flow for IB
> >> > uverbs is based upon releasing the file handle for it. Once the file
> >> > handle is released, all MRs, QPs, CQs, PDs, etc. that the process
> >> > allocated are released. For the kernel to release the file handle,
> >> > the kernel reference count to it needs to reach zero. Most IB
> >> > implementations expose some hardware registers to the application by
> >> > allowing it to mmap the uverbs device file. This mmap takes a
> >> > reference to uverbs device file handle that the application opened.
> >> > This reference is dropped when the process mm is released during the
> >> > process destruction. Your code takes a reference to the mm that
> >> > will only be released when the parent MR/QP is released.
> >> >
> >> > Now, we have a deadlock - the mm is waiting for the MR to be
> >> > destroyed, the MR is waiting for the file handle to be destroyed,
> >> > and the file handle is waiting for the mm to be destroyed.
> >> >
> >> > The proper solution is to keep a reference to the task_pid (using
> >> > get_task_pid), and use this pid to get the task_struct and from it
> >> > the mm_struct during the destruction flow.
> >
> > I'll put together a patch using get_task_pid() and see if I can
> > test/reproduce the issue. This may take a couple of days since we
> > have to test this in production at the moment.
> >
>
> Hi,
>
> I just wanted to point out that while working on the on demand paging patches
> we also needed to keep a reference to the task pid (to make sure we always
> handle page faults on behalf of the correct mm struct). You can find the
> relevant code in the patch titled "IB/core: Add support for on demand paging
> regions" [1].
>
> Haggai
>
> [1] https://www.mail-archive.com/linux-rdma@xxxxxxxxxxxxxxx/msg20552.html
Haggai,
I looked over the on demand paging patch and I'm not sure if you are
suggesting that it already fixes my issue, or that I should use it as
a reference for my code. In any case I just sent a v2 of a patch that
appears to fix my issue.
--
Shawn
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/