Re: [PATCH v2 6/6] vfio/type1: Use mapping page mask for pfnmaps

From: Peter Xu
Date: Tue Feb 18 2025 - 17:50:09 EST


On Tue, Feb 18, 2025 at 03:22:06PM -0700, Alex Williamson wrote:
> vfio-pci supports huge_fault for PCI MMIO BARs and will insert pud and
> pmd mappings for well aligned mappings. follow_pfnmap_start() walks the
> page table and therefore knows the page mask of the level where the
> address is found and returns this through follow_pfnmap_args.addr_mask.
> Subsequent pfns from this address until the end of the mapping page are
> necessarily consecutive. Use this information to retrieve a range of
> pfnmap pfns in a single pass.
>
> With optimal mappings and alignment on systems with 1GB pud and 4KB
> page size, this reduces iterations for DMA mapping PCI BARs by a
> factor of 256K. In real-world testing, the overhead of iterating
> pfns for a VM DMA mapping a 32GB PCI BAR is reduced from ~1s to
> sub-millisecond.
>
> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> ---
> drivers/vfio/vfio_iommu_type1.c | 23 ++++++++++++++++-------
> 1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index ce661f03f139..0ac56072af9f 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -520,7 +520,7 @@ static void vfio_batch_fini(struct vfio_batch *batch)
>
> static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm,
> unsigned long vaddr, unsigned long *pfn,
> - bool write_fault)
> + unsigned long *addr_mask, bool write_fault)
> {
> struct follow_pfnmap_args args = { .vma = vma, .address = vaddr };
> int ret;
> @@ -544,10 +544,12 @@ static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm,
> return ret;
> }
>
> - if (write_fault && !args.writable)
> + if (write_fault && !args.writable) {
> ret = -EFAULT;
> - else
> + } else {
> *pfn = args.pfn;
> + *addr_mask = args.addr_mask;
> + }
>
> follow_pfnmap_end(&args);
> return ret;
> @@ -590,15 +592,22 @@ static long vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
> vma = vma_lookup(mm, vaddr);
>
> if (vma && vma->vm_flags & VM_PFNMAP) {
> - ret = follow_fault_pfn(vma, mm, vaddr, pfn, prot & IOMMU_WRITE);
> + unsigned long addr_mask;
> +
> + ret = follow_fault_pfn(vma, mm, vaddr, pfn, &addr_mask,
> + prot & IOMMU_WRITE);
> if (ret == -EAGAIN)
> goto retry;
>
> if (!ret) {
> - if (is_invalid_reserved_pfn(*pfn))
> - ret = 1;
> - else
> + if (is_invalid_reserved_pfn(*pfn)) {
> + unsigned long epfn;
> +
> + epfn = (*pfn | (~addr_mask >> PAGE_SHIFT)) + 1;
> + ret = min_t(long, npages, epfn - *pfn);

s/long/unsigned long/?

Reviewed-by: Peter Xu <peterx@xxxxxxxxxx>

> + } else {
> ret = -EFAULT;
> + }
> }
> }
> done:
> --
> 2.48.1
>

--
Peter Xu