Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA

From: Jason Gunthorpe
Date: Fri Feb 08 2019 - 00:19:56 EST

On Thu, Feb 07, 2019 at 03:54:58PM -0800, Dan Williams wrote:

> > The only production worthy way is to have the FS be a partner in
> > making this work without requiring revoke, so the critical RDMA
> > traffic can operate safely.
> ...belies a path forward. Just swap out "FS be a partner" with "system
> administrator be a partner". In other words, if the RDMA stack can't
> tolerate an MR being disabled then the administrator needs to actively
> disable the paths that would trigger it. Turn off reflink, don't
> truncate, avoid any future FS feature that might generate unwanted
> lease breaks.

This is what I suggested already, except with explicit kernel aid, not
left as some Gordian knot for the administrator to unravel.

You already said it is too hard for expert FS developers to maintain a
mode switch, so it seems like a really big stretch to think application
and systems architects will have any hope of doing better.

It makes much more sense for the admin to flip some kind of bit and
the FS guarantees the safety that you are asking the admin to create.
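
To make the shape of that concrete, here is a rough userspace model of
the policy, not kernel code. Every name in it (model_file,
model_set_no_lease_break, model_longterm_pin, model_truncate) is
hypothetical and made up for illustration. It assumes the "bit" is
per-file and that truncate stands in for any operation that would
otherwise force a lease break:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file state; none of this is a real kernel API. */
struct model_file {
	bool no_lease_break;	/* the bit the admin flips */
	int longterm_pins;	/* outstanding long-term (RDMA-style) pins */
	size_t size;		/* current file size in bytes */
};

/* Admin sets or clears the bit. Clearing is refused while pins exist. */
int model_set_no_lease_break(struct model_file *f, bool on)
{
	assert(f != NULL);
	if (!on && f->longterm_pins > 0)
		return -EBUSY;
	f->no_lease_break = on;
	return 0;
}

/* A long-term pin is only allowed once the FS has promised safety. */
int model_longterm_pin(struct model_file *f)
{
	assert(f != NULL);
	if (!f->no_lease_break)
		return -EPERM;
	f->longterm_pins++;
	return 0;
}

/* Drop one pin, if any are held. */
void model_unpin(struct model_file *f)
{
	assert(f != NULL);
	if (f->longterm_pins > 0)
		f->longterm_pins--;
}

/*
 * The FS side of the bargain: an operation that would pull blocks out
 * from under a pin fails instead of revoking the pin.
 */
int model_truncate(struct model_file *f, size_t new_size)
{
	assert(f != NULL);
	if (f->longterm_pins > 0 && new_size < f->size)
		return -EBUSY;
	f->size = new_size;
	return 0;
}
```

The point of the sketch is only where the check lives: the shrinking
truncate fails in the FS path while a pin is held, so nothing has to
be revoked and the administrator does not have to know which
operations are dangerous.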

> We would need to make sure that lease notifications include the
> information to identify the lease breaker to debug escapes that
> might happen, but it is a solution that can be qualified to not
> lease break.

I think building a complicated lease framework and then telling
everyone in user space to design around it so it never gets used would
be very hard to explain and justify.

Never mind the security implications if some seemingly harmless future
filesystem change causes unexpected lease revokes across something
like a tenant boundary.

> In any event, this lets end users pick their filesystem
> (modulo RDMA incompatible features), provides an enumeration of
> lease break sources in the kernel, and opens up FS-DAX to a wider
> array of RDMA adapters. In general this is what Linux has
> historically done, give end users technology freedom.

I think this is not the Linux model. The kernel should not allow
unprivileged user space to do an operation that could be unsafe.

I continue to think this is the best idea that has come up - but
only if the filesystem is involved and expressly tells the kernel
layers that this combination of DAX & filesystem is safe.