Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)

From: John Hubbard
Date: Wed Aug 28 2019 - 23:29:27 EST


On 8/28/19 7:02 PM, Ira Weiny wrote:
On Mon, Aug 26, 2019 at 03:55:10PM +1000, Dave Chinner wrote:
On Fri, Aug 23, 2019 at 10:08:36PM -0700, Ira Weiny wrote:
On Sat, Aug 24, 2019 at 10:11:24AM +1000, Dave Chinner wrote:
On Fri, Aug 23, 2019 at 09:04:29AM -0300, Jason Gunthorpe wrote:
...

Sure, that part works because the struct file is passed. It doesn't
end up with the same fd number in the other process, though.

The issue is that layout leases need to notify userspace when they
are broken by the kernel, so a lease stores the owner pid/tid in the
file->f_owner field via __f_setown(). It also keeps a struct fasync
attached to the file_lock that records the fd that the lease was
created on. When a signal needs to be sent to userspace for that
lease, we call kill_fasync() and that walks the list of fasync
structures on the lease and calls:

send_sigio(fown, fa->fa_fd, band);

And it does this for every fasync struct attached to a lease. Yes, a
lease can track multiple fds, but it can only track them in a single
process context. The moment the struct file is shared with another
process, the lease is no longer capable of sending notifications to
all the lease holders.

Yes, you can change the owning process via F_SETOWN, but that's
still only a single process context, and you can't change the fd in
the fasync list. You can add a new fd to an existing lease by calling
F_SETLEASE on the new fd, but you still only have a single process
owner context for signal delivery.

As such, leases that require callbacks to userspace are currently
only valid within the process context the lease was taken in.
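
To make the single-owner limitation above concrete, here is a toy
userspace model of it. This is only a sketch and is not kernel code:
every toy_* name is hypothetical, the fixed-size array stands in for
the fasync list, and the "signal" is just a (pid, fd) pair written to
an output buffer. It only assumes what is described above: one owner
context per lease, and fds recorded at the time the lease was set.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FDS 8

/* Hypothetical stand-in for file->f_owner: exactly one owner context. */
struct toy_owner {
    int pid;
};

/* Hypothetical stand-in for a lease and its list of recorded fds. */
struct toy_lease {
    struct toy_owner owner;
    int fds[TOY_MAX_FDS];   /* fds recorded when the lease was set */
    size_t nfds;
};

/* One modeled notification: which pid is signalled, and for which fd. */
struct toy_signal {
    int pid;
    int fd;
};

/* Record another fd on the lease, as F_SETLEASE on a new fd would. */
static void toy_lease_add_fd(struct toy_lease *l, int fd)
{
    assert(l->nfds < TOY_MAX_FDS);
    l->fds[l->nfds++] = fd;
}

/* Change the single owner, as F_SETOWN would. Recorded fds are untouched. */
static void toy_lease_set_owner(struct toy_lease *l, int pid)
{
    l->owner.pid = pid;
}

/*
 * Model a lease break: emit one notification per recorded fd, and
 * address every one of them to the single owner. Returns the number
 * of notifications written to out (at most cap).
 */
static size_t toy_lease_break(const struct toy_lease *l,
                              struct toy_signal *out, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < l->nfds && n < cap; i++) {
        out[n].pid = l->owner.pid;
        out[n].fd = l->fds[i];
        n++;
    }
    return n;
}
```

In this model, a second process that received the struct file (and
sees it under a different fd number) never appears in the output of
toy_lease_break(): changing the owner only redirects all notifications
to another single pid, and the fd values stay the ones recorded in the
original process.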

But for long term pins we are not requiring callbacks.


Hi Ira,

If "require callbacks to userspace" means sending SIGIO, then actually
FOLL_LONGTERM *does* require those callbacks, because we've been, so
far, equating FOLL_LONGTERM with the vaddr_pin struct and with a lease.

What am I missing here?

thanks,
--
John Hubbard
NVIDIA