Re: [PATCH 2/3] vfio-pci: Fault mmaps to enable vma tracking

From: Jason Gunthorpe
Date: Fri May 01 2020 - 19:25:56 EST


On Fri, May 01, 2020 at 03:39:19PM -0600, Alex Williamson wrote:
> Rather than calling remap_pfn_range() when a region is mmap'd, setup
> a vm_ops handler to support dynamic faulting of the range on access.
> This allows us to manage a list of vmas actively mapping the area that
> we can later use to invalidate those mappings. The open callback
> invalidates the vma range so that all tracking is inserted in the
> fault handler and removed in the close handler.
>
> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> ---
> drivers/vfio/pci/vfio_pci.c | 76 ++++++++++++++++++++++++++++++++++-
> drivers/vfio/pci/vfio_pci_private.h | 7 +++
> 2 files changed, 81 insertions(+), 2 deletions(-)

> +static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
> +{
> + struct vm_area_struct *vma = vmf->vma;
> + struct vfio_pci_device *vdev = vma->vm_private_data;
> +
> + if (vfio_pci_add_vma(vdev, vma))
> + return VM_FAULT_OOM;
> +
> + if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> + vma->vm_end - vma->vm_start, vma->vm_page_prot))
> + return VM_FAULT_SIGBUS;
> +
> + return VM_FAULT_NOPAGE;
> +}
> +
> +static const struct vm_operations_struct vfio_pci_mmap_ops = {
> + .open = vfio_pci_mmap_open,
> + .close = vfio_pci_mmap_close,
> + .fault = vfio_pci_mmap_fault,
> +};
> +
> static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
> {
> struct vfio_pci_device *vdev = device_data;
> @@ -1357,8 +1421,14 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
> vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
>
> - return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
> - req_len, vma->vm_page_prot);
> + /*
> + * See remap_pfn_range(), called from vfio_pci_fault() but we can't
> + * change vm_flags within the fault handler. Set them now.
> + */
> + vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
> + vma->vm_ops = &vfio_pci_mmap_ops;

Perhaps do the vfio_pci_add_vma & remap_pfn_range combo here if the
BAR is activated ? That way a fully populated BAR is presented in the
common case and avoids taking a fault path?

But it does seem OK as is

Jason