Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

From: Daniel Vetter
Date: Mon Oct 05 2020 - 14:54:29 EST


On Mon, Oct 5, 2020 at 8:37 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Mon, Oct 05, 2020 at 08:16:33PM +0200, Daniel Vetter wrote:
>
> > > kvm is some similar hack added for P2P DMA, see commit
> > > add6a0cd1c5ba51b201e1361b05a5df817083618. It might be protected by notifiers..
> >
> > Yeah my thinking is that kvm (and I think also vfio, also seems to
> > have mmu notifier nearby) are ok because of the mmu notifier. Assuming
> > that one works correctly.
>
> vfio doesn't have a notifier, Alex was looking to add a vfio private
> scheme in the vma->private_data:
>
> https://lore.kernel.org/kvm/159017449210.18853.15037950701494323009.stgit@xxxxxxxxxx/
>
> Guess it never happened.

I was misled by the mmu notifier in drivers/vfio/vfio.c. But looking
closer, that's only used by some drivers, I guess to make sure their
device pagetables are kept in sync with reality. And not to make sure
the vfio pfn view is kept in sync with reality.

This could get real nasty I think.

> > > So, the answer really is that s390 and media need fixing, and this API
> > > should go away (or become kvm specific)
> >
> > I'm still not clear how you want to fix this, since your vma->dma_buf
> > idea is kinda a decade long plan and so just not going to happen:
>
> Well, it doesn't mean we have to change every part of dma_buf to
> participate in this. Just the bits media cares about. Or maybe it is
> some higher level variant on top of dma_buf.
>
> Or don't use dma_buf for this, add a new object that just provides
> refcounts and P2P DMA connection for IO pfn ranges..

So good news is, I dug some layers deeper in v4l, and there's only 2
users which do actually handle pfn and don't immediately convert to a
pages array:
- videobuf-dma-contig.c. Luckily videobuf 1 is deprecated since
forever, so I think we might get away with either just breaking this,
or at least tainting kernels and hiding it behind a nasty Kconfig.
This only uses follow_pfn, which we need to keep anyway for vfio in
the unsafe variant :-/
- videobuf2-vmalloc.c. Digging through history this was added to support
import of v4l buffers from drivers that needed contig memory. And way
back before CMA, that meant carveout memory not backed by struct page
*. That should now all have struct pages and be managed by CMA (since
videobuf2-dma-contig.c just uses dma_alloc_coherent underneath), so I
think we can just switch to pin_user_pages(FOLL_LONGTERM) here too.

iow I think I can outright delete the frame vector stuff.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch