Re: [regression?] Re: [PATCH v6 06/12] mm/gup: track FOLL_PIN pages

From: Jason Gunthorpe
Date: Tue Apr 28 2020 - 15:22:56 EST


On Tue, Apr 28, 2020 at 01:07:52PM -0600, Alex Williamson wrote:
> On Tue, 28 Apr 2020 14:49:57 -0300
> Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> > On Tue, Apr 28, 2020 at 10:54:55AM -0600, Alex Williamson wrote:
> > > static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
> > > {
> > > struct vfio_pci_device *vdev = device_data;
> > > @@ -1253,8 +1323,14 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
> > > vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> > > vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
> > >
> > > + vma->vm_ops = &vfio_pci_mmap_ops;
> > > +
> > > +#if 1
> > > + return 0;
> > > +#else
> > > return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> > > - req_len, vma->vm_page_prot);
> > > + vma->vm_end - vma->vm_start, vma->vm_page_prot);
> >
> > The remap_pfn_range here is what tells get_user_pages this is a
> > non-struct page mapping:
> >
> > vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
> >
> > Which has to be set when the VMA is created, they shouldn't be
> > modified during fault.
>
> Aha, thanks Jason! So fundamentally, pin_user_pages_remote() should
> never have been faulting in this vma since the pages are non-struct
> page backed.

gup should not try to pin them. I think the VM will still call fault,
though; I'm not sure from memory.
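
As a rough userspace model of that check (not the kernel source:
struct model_vma and the model_* helpers are made-up names, and the
flag values are copied from include/linux/mm.h purely for
illustration), gup refusing a vma that remap_pfn_range() has marked
would look something like this:

```c
#include <errno.h>

/* Flag values as found in include/linux/mm.h; illustrative only. */
#define VM_PFNMAP     0x00000400UL
#define VM_IO         0x00004000UL
#define VM_DONTEXPAND 0x00040000UL
#define VM_DONTDUMP   0x04000000UL

/* Hypothetical stand-in for struct vm_area_struct. */
struct model_vma {
	unsigned long vm_flags;
};

/*
 * Models the flags remap_pfn_range() applies to the vma. These are
 * meant to be in place when the VMA is created, not set during fault.
 */
static void model_mark_pfnmap(struct model_vma *vma)
{
	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
}

/*
 * Hypothetical model of the gup-side check: a vma marked VM_IO or
 * VM_PFNMAP has no struct pages to pin, so the request is refused.
 */
static int model_gup_check_vma(const struct model_vma *vma)
{
	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
		return -EFAULT;
	return 0;
}
```

With the flags missing, the model accepts the vma and gup would go on
to fault it in, which matches what the #if 1 path above ends up doing.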

> Maybe I was just getting lucky before this commit. For a
> VM_PFNMAP, vaddr_get_pfn() only needs pin_user_pages_remote() to return
> error and the vma information that we setup in vfio_pci_mmap().

I've written on this before: vfio should not be passing pages to the
iommu that it cannot pin, e.g. it should not touch VM_PFNMAP vmas in
the first place.

It is a use-after-free security issue the way it is.

> only need the fault handler to trigger for user access, which is what I
> see with this change. That should work for me.
>
> > Also the vma code above looked a little strange to me, if you do send
> > something like this cc me and I can look at it. I did some work like
> > this for rdma a while ago..
>
> Cool, I'll do that. I'd like to be able to zap the vmas from user
> access at a later point and I have doubts that I'm holding the
> refs/locks that I need to for that. Thanks,

Check rdma_umap_ops; it does what you described (actually it replaces
them with the zero page, but it zaps along the way too).

Jason