Re: [RFC PATCH] Optimize VFIO and IOMMU mapping traversal
From: Jason Gunthorpe
Date: Fri May 29 2026 - 07:56:15 EST
On Fri, May 29, 2026 at 03:09:32PM +0800, Guanghui Feng wrote:
> In VFIO, vfio_unmap_unpin requires performing iommu unmap and mm
> unpin on the address space. However, VFIO doesn't record the PHY
> address corresponding to iova, but instead obtains the iova-PHY
> mapping through iommu_iova_to_phys.
>
> In IOMMU, under conditions such as address alignment, it prioritizes
> mapping iova-PHY based on large pages. Therefore, during the
> vfio_unmap_unpin process, traversal can be performed at the
> granularity of the IOMMU map, reducing the number of
> iommu_iova_to_phys queries and significantly improving conversion
> efficiency.
>
> Therefore, an iommu_iova_to_pgsize implementation is added to the
> IOMMU driver to return the pagesize used for the iova mapping.
This is the wrong API. What we need here is an iova_to_phys variation
that returns a size, so that physically contiguous IOPTEs can be joined.
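
As a rough illustration of the idea only (a userspace toy, not kernel
code and not a proposed interface), a size-returning lookup could look
something like the sketch below. Every name here is hypothetical:
struct toy_iopte, toy_find, toy_iova_to_phys_len and toy_count_lookups
do not exist in the kernel, and a flat array stands in for the real
page table walk. The point is only that the lookup reports how many
bytes are physically contiguous from the given iova, joining adjacent
entries regardless of their individual page sizes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one leaf IOPTE: [iova, iova + size) -> phys. */
struct toy_iopte {
	uint64_t iova;
	uint64_t phys;
	uint64_t size;	/* page size of this entry, e.g. 4K or 2M */
};

/* Linear search stands in for a real page table walk. */
static const struct toy_iopte *toy_find(const struct toy_iopte *tbl,
					size_t n, uint64_t iova)
{
	for (size_t i = 0; i < n; i++)
		if (iova >= tbl[i].iova && iova - tbl[i].iova < tbl[i].size)
			return &tbl[i];
	return NULL;
}

/*
 * Hypothetical iova_to_phys variation. Returns the physical address
 * backing iova and stores in *len the number of bytes, capped at
 * max_len, that are physically contiguous starting there. Adjacent
 * entries are joined when their physical ranges line up. Returns 0
 * with *len == 0 when iova is not mapped.
 */
static uint64_t toy_iova_to_phys_len(const struct toy_iopte *tbl, size_t n,
				     uint64_t iova, uint64_t max_len,
				     uint64_t *len)
{
	const struct toy_iopte *e = toy_find(tbl, n, iova);
	uint64_t phys, run;

	*len = 0;
	if (!e)
		return 0;

	phys = e->phys + (iova - e->iova);
	run = e->size - (iova - e->iova);

	/* Extend the run while the next entry continues it physically. */
	while (run < max_len) {
		const struct toy_iopte *next = toy_find(tbl, n, iova + run);

		if (!next)
			break;
		if (next->phys + (iova + run - next->iova) != phys + run)
			break;
		run += next->size - (iova + run - next->iova);
	}

	*len = run < max_len ? run : max_len;
	return phys;
}

/* Count how many lookups an unmap-style walk over a range would need. */
static unsigned int toy_count_lookups(const struct toy_iopte *tbl, size_t n,
				      uint64_t iova, uint64_t size)
{
	uint64_t end = iova + size;
	unsigned int calls = 0;

	while (iova < end) {
		uint64_t len;

		toy_iova_to_phys_len(tbl, n, iova, end - iova, &len);
		calls++;
		if (!len)
			len = 4096;	/* skip an unmapped hole one 4K page at a time */
		iova += len;
	}
	return calls;
}
```

With a 2M entry followed by a physically contiguous 4K entry and then
an unrelated 4K entry, a walk over the whole range needs two lookups in
this toy model, instead of the 514 a per-4K-page walk would make. A
pgsize-only query would stop at the 2M boundary and could not join the
second entry.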
> drivers/iommu/amd/iommu.c | 2 ++
> drivers/iommu/generic_pt/iommu_pt.h | 53 +++++++++++++++++++++++++++++
> drivers/iommu/intel/iommu.c | 2 ++
> drivers/iommu/iommu.c | 25 ++++++++++++++
> drivers/vfio/vfio_iommu_type1.c | 17 +++++++--
If you care about performance, use iommufd, and patches like this have
to update iommufd too.
> static const struct iommu_domain_ops amdv1_ops = {
> IOMMU_PT_DOMAIN_OPS(amdv1),
> + IOMMU_PT_PGSIZE_OPS(amdv1),
> .iotlb_sync_map = amd_iommu_iotlb_sync_map,
> .flush_iotlb_all = amd_iommu_flush_iotlb_all,
> .iotlb_sync = amd_iommu_iotlb_sync,
It should be routed through the private ops structure, and even if not,
there is no reason to add another macro.
This would also need to be split into a patch adding the core API
function and then updating callers.
Jason