Re: [PATCH] ib_umem_release should decrement mm->pinned_vm from ib_umem_get

From: Haggai Eran
Date: Thu Aug 28 2014 - 07:49:13 EST

On 26/08/2014 00:07, Shawn Bohrer wrote:
>>>> The following patch fixes the issue by storing the mm_struct of the
>> >
>> > You are doing more than just storing the mm_struct - you are taking
>> > a reference to the process' mm. This can lead to a massive resource
>> > leakage. The reason is a bit complex: The destruction flow for IB
>> > uverbs is based upon releasing the file handle for it. Once the file
>> > handle is released, all MRs, QPs, CQs, PDs, etc. that the process
>> > allocated are released. For the kernel to release the file handle,
>> > the kernel reference count to it needs to reach zero. Most IB
>> > implementations expose some hardware registers to the application by
>> > allowing it to mmap the uverbs device file. This mmap takes a
>> > reference to the uverbs device file handle that the application opened.
>> > This reference is dropped when the process mm is released during the
>> > process destruction. Your code takes a reference to the mm that
>> > will only be released when the parent MR/QP is released.
>> >
>> > Now, we have a deadlock - the mm is waiting for the MR to be
>> > destroyed, the MR is waiting for the file handle to be destroyed,
>> > and the file handle is waiting for the mm to be destroyed.
>> >
>> > The proper solution is to keep a reference to the task_pid (using
>> > get_task_pid), and use this pid to get the task_struct and from it
>> > the mm_struct during the destruction flow.
> I'll put together a patch using get_task_pid() and see if I can
> test/reproduce the issue. This may take a couple of days since we
> have to test this in production at the moment.


I just wanted to point out that while working on the on-demand paging patches
we also needed to keep a reference to the task pid (to make sure we always
handle page faults on behalf of the correct mm_struct). You can find the
relevant code in the patch titled "IB/core: Add support for on demand paging
regions" [1].
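
To illustrate the get_task_pid() approach discussed above, here is a rough
userspace model of the reference pattern. It is not kernel code: every
*_model type and function is a hypothetical stand-in, and the comments only
name the kernel call that each step is meant to mirror. The point it tries
to show is that the umem pins the pid for its whole lifetime but takes the
mm only briefly at release time.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel definitions. */
struct mm_model   { int users; unsigned long pinned_vm; };
struct task_model { struct mm_model *mm; };   /* mm == NULL after exit */
struct pid_model  { int count; struct task_model *task; };
struct umem_model { struct pid_model *pid; unsigned long npages; };

/* Registration: pin only the pid, never the mm. */
static void umem_get_model(struct umem_model *u, struct pid_model *pid,
                           unsigned long npages)
{
    assert(pid->task && pid->task->mm);   /* caller is a live process */
    pid->count++;                         /* kernel: get_task_pid() */
    u->pid = pid;
    u->npages = npages;
    pid->task->mm->pinned_vm += npages;
}

/* Release: returns 1 if pinned_vm was adjusted, 0 if the mm was gone. */
static int umem_release_model(struct umem_model *u)
{
    struct task_model *task = u->pid->task;        /* kernel: get_pid_task() */
    struct mm_model *mm = task ? task->mm : NULL;  /* kernel: get_task_mm() */
    int accounted = 0;

    if (mm) {
        mm->users++;                 /* temporary reference only */
        mm->pinned_vm -= u->npages;
        mm->users--;                 /* kernel: mmput() */
        accounted = 1;
    }
    u->pid->count--;                 /* kernel: put_pid() */
    u->pid = NULL;
    return accounted;
}
```

Since the mm reference is held only for the duration of the release, process
teardown never has to wait on the MR. If the mm is already gone when the
umem is released, there is nothing left to account and the step is skipped.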

