Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

From: Dave Chinner
Date: Mon Aug 19 2019 - 21:13:27 EST


On Mon, Aug 19, 2019 at 09:38:41AM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 19, 2019 at 07:24:09PM +1000, Dave Chinner wrote:
>
> > So that leaves just the normal close() syscall exit case, where the
> > application has full control of the order in which resources are
> > released. We've already established that we can block in this
> > context. Blocking in an interruptible state will allow fatal signal
> > delivery to wake us, and then we fall into the
> > fatal_signal_pending() case if we get a SIGKILL while blocking.
>
> The major problem with RDMA is that it doesn't always wait on close() for the
> MR holding the page pins to be destroyed. This is done to avoid a
> deadlock of the form:
>
>    uverbs_destroy_ufile_hw()
>      mutex_lock()
>        [..]
>          mmput()
>            exit_mmap()
>              remove_vma()
>                fput();
>                  file_operations->release()

I think this is wrong, and I'm pretty sure it's an example of why
the final __fput() call is moved out of line.

    fput()
      fput_many()
        task_work_add(f, __fput())

and the call chain ends there.

Before the syscall returns to userspace, it then runs the __fput()
call through the task_work_run() interfaces, and hence the call
chain is just:

    task_work_run
      __fput
>       file_operations->release()
>         ib_uverbs_close()
>           uverbs_destroy_ufile_hw()
>             mutex_lock()   <-- Deadlock

And there is no deadlock because nothing holds the mutex at this
point.
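
To make the ordering concrete, here is a toy userspace model of the
two cases. It is a sketch only, not kernel code: every toy_* name is
made up for illustration, the "mutex" is just a flag that records
whether it is held, and the pending[] array stands in for the per-task
work list. The only thing it tries to show is that queueing the release
and running it after the lock is dropped never re-enters the lock,
whereas calling release synchronously under the lock would.

```c
/* Toy userspace model of deferred fput(). NOT kernel code; all toy_*
 * names are hypothetical stand-ins for illustration only. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORK 8

typedef void (*work_fn)(void *);
struct toy_work { work_fn fn; void *arg; };

/* Callbacks to run just before "returning to userspace". */
static struct toy_work pending[MAX_WORK];
static size_t npending;

static bool ufile_locked;	/* stand-in for the ufile mutex */
static int releases_run;	/* successful release() calls */
static int deadlocks_seen;	/* attempts to take the lock while held */

/* Returns false where a real, non-recursive mutex would block forever. */
static bool toy_lock(void)
{
	if (ufile_locked)
		return false;
	ufile_locked = true;
	return true;
}

static void toy_unlock(void)
{
	ufile_locked = false;
}

/* Models ->release() -> ib_uverbs_close() -> uverbs_destroy_ufile_hw(). */
static void toy_release(void *arg)
{
	(void)arg;
	if (!toy_lock()) {
		deadlocks_seen++;
		return;
	}
	releases_run++;
	toy_unlock();
}

/* Drop the last reference: either queue the release or call it inline. */
static void toy_fput(bool defer)
{
	if (defer && npending < MAX_WORK) {
		pending[npending++] = (struct toy_work){ toy_release, NULL };
		return;		/* the call chain ends here */
	}
	toy_release(NULL);	/* synchronous release, for comparison */
}

/* Run everything that was queued, then empty the list. */
static void toy_task_work_run(void)
{
	for (size_t i = 0; i < npending; i++)
		pending[i].fn(pending[i].arg);
	npending = 0;
}

/* Destroy path: take the lock, drop the file reference underneath it,
 * unlock, then run queued work. Returns the number of deadlocks hit. */
static int toy_destroy_path(bool defer)
{
	deadlocks_seen = 0;
	(void)toy_lock();
	toy_fput(defer);
	toy_unlock();
	toy_task_work_run();
	return deadlocks_seen;
}
```

Calling toy_destroy_path(true) models the deferred case: release runs
from toy_task_work_run() after the lock has been dropped.
toy_destroy_path(false) models the call chain from the quoted text,
where release would be entered with the lock still held.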

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx