Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions

From: Jason Gunthorpe
Date: Wed Dec 12 2018 - 18:37:13 EST

On Wed, Dec 12, 2018 at 04:53:49PM -0500, Jerome Glisse wrote:
> > Almost, we need some safety around assuming that DMA to the page is
> > complete, so the notification would need to go all the way to
> > userspace with something like a file lease notification. It would
> > also need to be backstopped by an IOMMU in the case where the
> > hardware does not / cannot stop in-flight DMA.
> You can always reprogram the hardware right away; it will redirect
> any DMA to the crappy page.

That causes silent data corruption for RDMA users - we can't do that.

The only way out for current hardware is to forcibly terminate the
RDMA activity somehow (and I'm not even sure this is possible, at
least it would be driver specific)

Even the IOMMU idea probably doesn't work: I doubt all current
hardware can handle a PCI-E error TLP properly.

On some hardware it probably just protects DAX by causing data
corruption for RDMA - I fail to see how that is a win for system
stability if the user obviously wants to use DAX and RDMA together...

I think your ODP-only approach is the only one that meets your
requirements. The only other approach that preserves data integrity is
to block or fail ftruncate and similar operations.

> From my point of view the driver should listen to ftruncate before
> the mmu notifier kicks in, send an event to userspace, and maybe
> wait and block ftruncate (or move it to a worker thread).

We can do this, but we can't guarantee forward progress in userspace,
and the best cancellation method we have that is portable to all RDMA
hardware is to kill the process(es).

So if that is acceptable then we could use user notifiers and allow
non-ODP users...
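
To make that trade-off concrete, here is a rough model of the "notify,
wait, then kill" policy in Python. It is only a sketch: revoke_pins()
and its notify, pins_released, and kill callbacks are hypothetical
stand-ins for driver-specific mechanisms, not existing kernel or RDMA
APIs, and the bounded poll count stands in for whatever deadline
ftruncate would be allowed to wait.

```python
from enum import Enum, auto


class Outcome(Enum):
    RELEASED = auto()  # userspace dropped its pins before the deadline
    KILLED = auto()    # deadline passed, the process was terminated


def revoke_pins(notify, pins_released, kill, deadline_polls):
    """Model of revoking a non-ODP registration ahead of ftruncate.

    All three callbacks are hypothetical placeholders:
      notify()        -- tell userspace to stop DMA and drop its pins
      pins_released() -- True once no pinned pages remain
      kill()          -- forcibly terminate the owning process(es)
    deadline_polls bounds how many times we check before giving up.
    """
    notify()
    for _ in range(deadline_polls):
        if pins_released():
            return Outcome.RELEASED
    # No forward progress from userspace: fall back to the one
    # cancellation method assumed to work on all hardware.
    kill()
    return Outcome.KILLED
```

A cooperative process ends in Outcome.RELEASED as soon as it drops its
pins; one that never responds ends in Outcome.KILLED, which is the
forward-progress cost described above.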