Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA

From: Jason Gunthorpe
Date: Wed Feb 06 2019 - 18:21:36 EST

On Wed, Feb 06, 2019 at 02:44:45PM -0800, Dan Williams wrote:

> > Do they need to stick with xfs?
> Can you clarify the motivation for that question? This problem exists
> for any filesystem that implements an mmap where the physical
> page backing the mapping is identical to the physical storage location
> for the file data.

.. and needs to dynamically change that mapping. Which is not really
something inherent to the general idea of a filesystem. A file system
that had *strictly static* block assignments would work fine.

Not all filesystems even implement hole punch.

Not all filesystems implement reflink.

ftruncate doesn't *have* to instantly return the freed blocks to the
allocation pool.

ie this is not a DAX & RDMA issue but an XFS & RDMA issue.

Replacing XFS is probably not reasonable, but I wonder if an XFS--
operating mode could exist that had enough features removed to be
safe?

Ie turn off REFLINK. Change the semantics of ftruncate to be more like
ETXTBSY. Turn off hole punch.
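
As a toy illustration of that restricted mode (userspace C, not XFS or
kernel code; every name below is hypothetical), the semantics could be
modelled roughly like this: shrinking a file fails with ETXTBSY while a
long-term pin is outstanding, and hole punch and reflink are simply not
offered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: a file whose block map must not shrink
 * while any long-term DMA pin (RDMA-style) is outstanding. */
struct toy_file {
    size_t nblocks;        /* blocks currently assigned to the file */
    unsigned int pins;     /* outstanding long-term pins */
    bool reflink_enabled;  /* false in the restricted mode */
};

/* Take a long-term pin on the file's blocks. */
void toy_pin(struct toy_file *f)
{
    f->pins++;
}

/* Drop a long-term pin; the caller must hold one. */
void toy_unpin(struct toy_file *f)
{
    assert(f->pins > 0);
    f->pins--;
}

/* Shrinking would free blocks a device may still be using, so refuse
 * with ETXTBSY while pinned. Growing never frees blocks, so allow it. */
int toy_ftruncate(struct toy_file *f, size_t new_nblocks)
{
    if (new_nblocks < f->nblocks && f->pins > 0)
        return -ETXTBSY;
    f->nblocks = new_nblocks;
    return 0;
}

/* Hole punch is turned off entirely in this mode. */
int toy_punch_hole(struct toy_file *f)
{
    (void)f;
    return -EOPNOTSUPP;
}

/* Reflink is only available if the mode leaves it enabled. */
int toy_reflink(struct toy_file *f)
{
    return f->reflink_enabled ? 0 : -EOPNOTSUPP;
}
```

In this sketch a truncate that was refused succeeds again once the last
pin is dropped, which is the ETXTBSY-like behaviour: the caller gets an
error instead of the blocks being freed underneath the device.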

> > Are they really trying to do COW backed mappings for the RDMA
> > targets? Or do they want a COW backed FS but are perfectly happy
> > if the specific RDMA targets are *not* COW and are statically
> > allocated?
> I would expect the COW to be broken at registration time. Only ODP
> could possibly support reflink + RDMA. So I think this devolves the
> problem back to just the "what to do about truncate/punch-hole"
> problem in the specific case of non-ODP hardware combined with the
> Filesystem-DAX facility.

Usually the problem with COW is that you make a READ RDMA MR on a
COW'd file, and some other thread breaks the COW..

This probably becomes a problem if the same process that has the MR
triggers a COW break (ie by writing to the CPU mmap). This would cause
the page to be reassigned but the MR would not be updated, which is
not what the app expects.

WRITE is simpler: once the COW is broken during GUP, the pages cannot
be COW'd again until the DMA pin is released. So new reflinks would be
blocked during the DMA pin period.

To fix READ you'd have to treat it like WRITE and break the COW at GUP.
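
The READ vs WRITE difference can be shown with a small userspace toy
model (C, not kernel code; the struct and function names are all made
up for illustration, and "pages" are just integer ids). Registering
without breaking the COW lets a later CPU write move the mapping to a
new page while the MR keeps the old one; breaking the COW at
registration keeps the two in sync and blocks new reflinks while pinned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model of one page of a file mapping. */
struct toy_mapping {
    int page;           /* page id currently backing the CPU mapping */
    bool shared;        /* true while the page is shared via reflink */
    int next_page;      /* next free page id, used by a COW break */
    unsigned int pins;  /* outstanding DMA pins on this mapping */
};

/* Hypothetical memory region: remembers the page it was registered on. */
struct toy_mr {
    int page;
};

/* Break the COW: give the mapping a private page. */
void toy_break_cow(struct toy_mapping *m)
{
    assert(m);
    if (m->shared) {
        m->page = m->next_page++;
        m->shared = false;
    }
}

/* A CPU store through the mmap always breaks the COW first. */
void toy_cpu_write(struct toy_mapping *m)
{
    toy_break_cow(m);
}

/* Register an MR. break_cow_at_gup = true models WRITE, or READ
 * treated like WRITE; false models a READ MR on a still-shared page. */
struct toy_mr toy_reg_mr(struct toy_mapping *m, bool break_cow_at_gup)
{
    if (break_cow_at_gup)
        toy_break_cow(m);
    m->pins++;
    return (struct toy_mr){ .page = m->page };
}

/* New reflinks are refused for as long as a DMA pin is held. */
int toy_reflink(struct toy_mapping *m)
{
    if (m->pins > 0)
        return -EBUSY;
    m->shared = true;
    return 0;
}
```

With break_cow_at_gup false, calling toy_cpu_write() after
toy_reg_mr() leaves the MR on the old page id and the mapping on a new
one, which is the stale-MR case described above. With it true, the
later write is a no-op for the mapping and the MR stays valid.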