Re: [RFC PATCH] Optimize VFIO and IOMMU mapping traversal

In reply to: Guanghui Feng: "[RFC PATCH] Optimize VFIO and IOMMU mapping traversal"

From: Jason Gunthorpe

Date: Fri May 29 2026 - 07:56:15 EST

On Fri, May 29, 2026 at 03:09:32PM +0800, Guanghui Feng wrote:
> In VFIO, vfio_unmap_unpin requires performing iommu unmap and mm
> unpin on the address space. However, VFIO doesn't record the PHY
> address corresponding to iova, but instead obtains the iova-PHY
> mapping through iommu_iova_to_phys.
>
> In IOMMU, under conditions such as address alignment, it prioritizes
> mapping iova-PHY based on bigpages. Therefore, during the
> vfio_unmap_unpin process, traversal can be performed at the
> granularity of the IOMMU map, reducing the number of
> iommu_iova_to_phys queries and significantly improving conversion
> efficiency.
>
> Therefore, an iommu_iova_to_pgsize implementation is added to the
> IOMMU driver to return the pagesize used for the iova mapping.

This is the wrong API. What we need here is an iova_to_phys variant
that returns a size, so that physically contiguous IOPTEs can be joined.
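
To illustrate the shape of such an interface, here is a minimal userspace
sketch. It is not kernel code: the struct toy_iopte table, the
toy_iova_to_phys_length() helper, and toy_count_lookups() are hypothetical
names made up for this example, and a flat sorted array stands in for the
real page table. The lookup returns the physical address plus the number of
bytes that stay physically contiguous from that IOVA, joining adjacent
entries where possible, so the caller can advance by the returned length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one leaf IOPTE: maps [iova, iova + size) to phys. */
struct toy_iopte {
	uint64_t iova;
	uint64_t phys;
	uint64_t size;
};

/* Toy page table: entries sorted by iova and non-overlapping. */
struct toy_domain {
	const struct toy_iopte *ptes;
	size_t nr;
};

/*
 * Hypothetical helper (not an existing kernel API): translate iova and
 * report in *len how many bytes starting at iova are physically
 * contiguous, capped at max_len. Adjacent entries are joined when both
 * the IOVA range and the physical range continue without a gap.
 * Returns the physical address, or 0 with *len = 0 if iova is unmapped.
 */
static uint64_t toy_iova_to_phys_length(const struct toy_domain *d,
					uint64_t iova, uint64_t max_len,
					uint64_t *len)
{
	size_t i;

	assert(d && len);
	*len = 0;

	for (i = 0; i < d->nr; i++) {
		const struct toy_iopte *p = &d->ptes[i];
		uint64_t off, phys, run;

		if (iova < p->iova || iova - p->iova >= p->size)
			continue;

		off = iova - p->iova;
		phys = p->phys + off;
		run = p->size - off;

		/* Extend the run across following entries while contiguous. */
		while (run < max_len && i + 1 < d->nr &&
		       d->ptes[i + 1].iova == iova + run &&
		       d->ptes[i + 1].phys == phys + run) {
			i++;
			run += d->ptes[i].size;
		}

		*len = run < max_len ? run : max_len;
		return phys;
	}
	return 0;
}

/* Walk [iova, iova + size) and count how many lookups the walk needs. */
static unsigned int toy_count_lookups(const struct toy_domain *d,
				      uint64_t iova, uint64_t size)
{
	unsigned int lookups = 0;
	uint64_t end = iova + size;

	while (iova < end) {
		uint64_t len;

		toy_iova_to_phys_length(d, iova, end - iova, &len);
		lookups++;
		/* Skip an unmapped hole one 4 KiB page at a time. */
		iova += len ? len : 0x1000;
	}
	return lookups;
}
```

With a page-size query the caller still issues one lookup per IOPTE. With
a length-returning lookup, a 2 MiB entry followed by physically contiguous
4 KiB entries is covered by a single call, which is the joining behaviour
a pgsize-only API cannot express.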

> drivers/iommu/amd/iommu.c | 2 ++
> drivers/iommu/generic_pt/iommu_pt.h | 53 +++++++++++++++++++++++++++++
> drivers/iommu/intel/iommu.c | 2 ++
> drivers/iommu/iommu.c | 25 ++++++++++++++
> drivers/vfio/vfio_iommu_type1.c | 17 +++++++--

If you care about performance, use iommufd. Patches like this have
to update iommufd too.

> static const struct iommu_domain_ops amdv1_ops = {
> IOMMU_PT_DOMAIN_OPS(amdv1),
> + IOMMU_PT_PGSIZE_OPS(amdv1),
> .iotlb_sync_map = amd_iommu_iotlb_sync_map,
> .flush_iotlb_all = amd_iommu_flush_iotlb_all,
> .iotlb_sync = amd_iommu_iotlb_sync,

It should be routed through the private ops structure, and even if it
is not, there is no reason to add another macro.

This would also need to be split into a patch adding the core API
function and then updating callers.

Jason