Re: [PATCH v3 23/32] vfio: use iova_to_phys_length for efficient unmap
From: Jason Gunthorpe
Date: Thu Jun 04 2026 - 10:41:59 EST

On Wed, Jun 03, 2026 at 11:17:55PM +0800, Guanghui Feng wrote:
> Use iommu_iova_to_phys_length() to get PTE page size, allowing
> traversal by actual mapping granularity instead of PAGE_SIZE steps.
>
> Signed-off-by: Guanghui Feng <guanghuifeng@xxxxxxxxxxxxxxxxx>
> Acked-by: Shiqiang Zhang <shiyu.zsq@xxxxxxxxxxxxxxxxx>
> Acked-by: Simon Guo <wei.guo.simon@xxxxxxxxxxxxxxxxx>
> ---
> drivers/vfio/vfio_iommu_type1.c | 27 ++++++++++++++++++++++-----
> 1 file changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index c8151ba54de3..115d88d7003e 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1177,25 +1177,42 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>
> iommu_iotlb_gather_init(&iotlb_gather);
> while (pos < dma->size) {
> - size_t unmapped, len;
> + size_t unmapped, len, pgsize;
> phys_addr_t phys, next;
> dma_addr_t iova = dma->iova + pos;
>
> - phys = iommu_iova_to_phys(domain->domain, iova);
> - if (WARN_ON(!phys)) {
> + /* Single page table walk returns both phys and PTE size */
> + phys = iommu_iova_to_phys_length(domain->domain, iova,
> + &pgsize);
> + if (WARN_ON(phys == PHYS_ADDR_MAX)) {
> pos += PAGE_SIZE;
> continue;
> }
> + if (WARN_ON(!pgsize || pgsize < PAGE_SIZE))
> + pgsize = PAGE_SIZE;
>
> /*
> * To optimize for fewer iommu_unmap() calls, each of which
> * may require hardware cache flushing, try to find the
> * largest contiguous physical memory chunk to unmap.
> + *
> + * mapped_length already accounts for contiguous entries
> + * from iova, then try to join following physically
> + * contiguous PTEs.
> */
> - for (len = PAGE_SIZE; pos + len < dma->size; len += PAGE_SIZE) {
> - next = iommu_iova_to_phys(domain->domain, iova + len);
> + len = min_t(size_t, pgsize, dma->size - pos);
> + for (; pos + len < dma->size; ) {
> + size_t next_pgsize;
> +
> + next = iommu_iova_to_phys_length(domain->domain,
> + iova + len,
> + &next_pgsize);
vfio should not be calling it twice. The core code needs to return the
best length as efficiently as it can, rather than having this open
coded in callers.

I think I've said this three times now.
Jason
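
For illustration only, here is a rough userspace model of the shape the
review asks for: a single lookup that returns the physical address together
with the longest physically contiguous length, so the caller's loop makes one
call per chunk. Every name below (toy_pte, toy_domain,
toy_iova_to_phys_length, toy_unmap_range) is hypothetical, and the flat sorted
table is an assumption standing in for a real page table walk. This is not
the kernel API or the eventual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE    4096u
#define TOY_PHYS_INVALID UINT64_MAX

/* One leaf entry of a toy page table: maps [iova, iova + size) to phys. */
struct toy_pte {
	uint64_t iova;
	uint64_t phys;
	size_t size;
};

/* Hypothetical domain: entries sorted by iova and non-overlapping. */
struct toy_domain {
	const struct toy_pte *ptes;
	size_t nr;
};

/* Example table: a 2 MiB entry, a contiguous 4 KiB entry, a discontiguous one. */
static const struct toy_pte toy_example_ptes[] = {
	{ 0x100000, 0x40000000, 0x200000 },
	{ 0x300000, 0x40200000, 0x1000 },
	{ 0x301000, 0x80000000, 0x1000 },
};

/*
 * Hypothetical core helper: translate iova and store in *len the number of
 * bytes starting at iova that are mapped physically contiguously, capped at
 * max_len. Returns TOY_PHYS_INVALID (and *len == 0) if iova is unmapped.
 */
static uint64_t toy_iova_to_phys_length(const struct toy_domain *d,
					uint64_t iova, size_t max_len,
					size_t *len)
{
	size_t i, run;
	uint64_t off;
	uint64_t phys;

	*len = 0;
	for (i = 0; i < d->nr; i++) {
		if (iova >= d->ptes[i].iova &&
		    iova - d->ptes[i].iova < d->ptes[i].size)
			break;
	}
	if (i == d->nr)
		return TOY_PHYS_INVALID;

	off = iova - d->ptes[i].iova;
	phys = d->ptes[i].phys + off;
	run = d->ptes[i].size - off;

	/* Extend while the next entry is contiguous in both IOVA and phys. */
	while (run < max_len && i + 1 < d->nr &&
	       d->ptes[i + 1].iova == d->ptes[i].iova + d->ptes[i].size &&
	       d->ptes[i + 1].phys == d->ptes[i].phys + d->ptes[i].size) {
		i++;
		run += d->ptes[i].size;
	}

	*len = run < max_len ? run : max_len;
	return phys;
}

/*
 * Caller side: exactly one lookup per chunk. Returns the number of chunks,
 * i.e. how many unmap calls a real caller would issue for this range.
 */
static size_t toy_unmap_range(const struct toy_domain *d, uint64_t iova,
			      size_t size)
{
	size_t pos = 0, chunks = 0;

	while (pos < size) {
		size_t len;
		uint64_t phys = toy_iova_to_phys_length(d, iova + pos,
							size - pos, &len);

		if (phys == TOY_PHYS_INVALID) {
			pos += TOY_PAGE_SIZE;	/* hole: skip one page */
			continue;
		}
		assert(len > 0 && len <= size - pos);
		chunks++;	/* a real caller would unmap and unpin here */
		pos += len;
	}
	return chunks;
}
```

With the example table, the 2 MiB entry and the 4 KiB entry that follows it
contiguously are reported as a single chunk, and the caller never issues a
second lookup to probe the next entry. The coalescing policy lives in one
place rather than being repeated in each caller.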