Re: [PATCH v1] iommu: Skip mapping at address 0x0 if it already exists

From: Antheas Kapenekakis

Date: Fri Feb 27 2026 - 03:06:58 EST


On Fri, 27 Feb 2026 at 02:03, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> On Thu, Feb 26, 2026 at 09:40:10PM +0100, Antheas Kapenekakis wrote:
> > I am still concerned about unaligned checks. It is a functional change
> > that can cause regressions in all devices. The approach of this patch
> > does not affect behavior in other devices. I would like for Jason to
> > weigh in.
>
> I think Robin's solution is very clever, but I share the concern
> regarding what all the implementations do.

I will send a v3 that uses a ternary instead, so that only the lookup at
address 0 changes. I will compile-test it now and run-test it later today.
I have no preference between the two fixes.

```
phys_addr = iommu_iova_to_phys(domain, addr ? addr : 1);
```
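
To make the intent concrete, here is a minimal user-space sketch (not kernel
code) of why probing with a non-zero IOVA helps. It relies on a return value
of 0 from the lookup being read as "not mapped". toy_iova_to_phys() and
toy_probe() are hypothetical names; the toy models a single identity-mapped
4 KiB page at IOVA 0 on an implementation that keeps the sub-page offset.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

#define TOY_PAGE_SIZE 4096ULL /* assumed page size for the sketch */

/*
 * Hypothetical stand-in for iommu_iova_to_phys(): one identity-mapped
 * page at IOVA 0, nothing else mapped. Returns 0 for "no translation".
 */
static phys_addr_t toy_iova_to_phys(dma_addr_t iova)
{
	if (iova >= TOY_PAGE_SIZE)
		return 0;
	/* The page base is 0, so only the sub-page offset remains. */
	return 0 | (iova & (TOY_PAGE_SIZE - 1));
}

/* The probe from the snippet above: only addr == 0 is redirected to 1. */
static phys_addr_t toy_probe(dma_addr_t addr)
{
	return toy_iova_to_phys(addr ? addr : 1);
}
```

In this model toy_iova_to_phys(0) yields 0 even though the page is mapped,
while toy_probe(0) yields 1. Any non-zero addr is passed through unchanged,
which is why only address 0 sees different behavior.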


Antheas

> So, I fed this question to Claude. It did find two counterpoints (see
> below for the whole report I had it generate):
>
> Implementations that lose the offset
>
> s390-iommu (drivers/iommu/s390-iommu.c:989): After the 3-level ZPCI walk,
> returns pte & ZPCI_PTE_ADDR_MASK with no sub-page offset added back.
> iova_to_phys(0) and iova_to_phys(1) return the same page-aligned PA.
>
> mtk_iommu_v1 (drivers/iommu/mtk_iommu_v1.c:396): Looks up the PTE by
> iova >> PAGE_SHIFT (discarding offset), then returns pte & ~(page_size-1). No
> step adds the sub-page offset back.
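
For these two, probing at IOVA 1 would not help if the page mapped at IOVA 0
also has physical address 0. The sketch below puts a translation that masks
the offset away next to one that keeps it. Both functions and the 4 KiB page
size are hypothetical stand-ins for illustration, not the s390 or mtk code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

#define TOY_PAGE_SIZE 4096ULL
#define TOY_ADDR_MASK (~(TOY_PAGE_SIZE - 1))

/* Offset kept: page base from the PTE, low bits from the IOVA. */
static phys_addr_t toy_xlate_keep(uint64_t pte, dma_addr_t iova)
{
	return (pte & TOY_ADDR_MASK) | (iova & ~TOY_ADDR_MASK);
}

/* Offset dropped: models the "pte & ADDR_MASK" pattern quoted above. */
static phys_addr_t toy_xlate_drop(uint64_t pte, dma_addr_t iova)
{
	(void)iova; /* the low bits of the IOVA never reach the result */
	return pte & TOY_ADDR_MASK;
}
```

With a PTE whose physical base is 0, toy_xlate_drop(pte, 1) still returns 0,
so the probe stays indistinguishable from an unmapped address. With a
non-zero base both variants return a non-zero value and the check works
either way.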
>
> I checked myself and it seems correct. I didn't try to confirm that
> the cases it says are OK are in fact OK, but it paints a convincing
> picture.
>
> I doubt S390 uses the function you are fixing, and I have no idea
> about mtk. Below is also a diff showing how Claude thought to fix it; I
> didn't try to check it.
>
> So, I'd say if Robin is OK with these outliers then it is a good and fine
> approach.
>
> Jason
>
> iova_to_phys Implementation Survey
>
> Entry point: iommu_iova_to_phys() in drivers/iommu/iommu.c:2502 calls
> domain->ops->iova_to_phys(domain, iova) via iommu_domain_ops.
>
> Category 1 — Delegates to io-pgtable
>
> These drivers hold an io_pgtable_ops * and call ops->iova_to_phys(ops, iova).
> The actual walk happens in one of the io-pgtable backends listed in Category
> 4.
>
> --------------------------------------------------------------------------------------------------------------
> Driver         Function (file:line)                              Ops assignment  Notes
> -------------  ------------------------------------------------  --------------  -----------------------------
> arm-smmu-v3    arm_smmu_iova_to_phys                             :3767           Pure delegation
>                drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:3471
>
> arm-smmu       arm_smmu_iova_to_phys                             :1655           S1 with FEAT_TRANS_OPS uses
> v1/v2          drivers/iommu/arm/arm-smmu/arm-smmu.c:1387                        hw ATS1PR registers (CB_PAR),
>                                                                                  otherwise io-pgtable
>
> apple-dart     apple_dart_iova_to_phys                           :1021           Pure delegation →
>                drivers/iommu/apple-dart.c:531                                    io-pgtable-dart
>
> qcom_iommu     qcom_iommu_iova_to_phys                           :605            Delegation with spinlock
>                drivers/iommu/arm/arm-smmu/qcom_iommu.c:492
>
> ipmmu-vmsa     ipmmu_iova_to_phys                                :895            Uses ARM_32_LPAE_S1 format
>                drivers/iommu/ipmmu-vmsa.c:702
>
> mtk_iommu      mtk_iommu_iova_to_phys                            :1073           Delegation + 4GB mode PA
>                drivers/iommu/mtk_iommu.c:861                                     remap fixup
> --------------------------------------------------------------------------------------------------------------
>
> Category 2 — Open-coded page table walk
>
> These drivers implement their own page table traversal without io-pgtable.
>
> --------------------------------------------------------------------------------------------------------
> Driver          Function (file:line)                Ops assignment  Walk structure
> --------------  ----------------------------------  --------------  ------------------------------------
> sun50i-iommu    sun50i_iommu_iova_to_phys           :860            2-level (DTE → PTE)
>                 drivers/iommu/sun50i-iommu.c:662
>
> exynos-iommu    exynos_iommu_iova_to_phys           :1487           2-level (section/large/small page)
>                 drivers/iommu/exynos-iommu.c:1375
>
> riscv-iommu     riscv_iommu_iova_to_phys            :1355           Sv39/48/57 via
>                 drivers/iommu/riscv/iommu.c:1280                    riscv_iommu_pte_fetch (:1166)
>
> omap-iommu      omap_iommu_iova_to_phys             :1727           iopgtable_lookup_entry helper
>                 drivers/iommu/omap-iommu.c:1596                     (super section/section/large/small)
>
> rockchip-iommu  rk_iommu_iova_to_phys               :1190           2-level (DTE → PTE)
>                 drivers/iommu/rockchip-iommu.c:651
>
> msm_iommu       msm_iommu_iova_to_phys              :709            Hardware walk: writes VA to V2PPR
>                 drivers/iommu/msm_iommu.c:526                       register, reads PA from PAR register
>
> s390-iommu      s390_iommu_iova_to_phys             :1186           3-level ZPCI (region → segment
>                 drivers/iommu/s390-iommu.c:989                      → page)
>
> tegra-smmu      tegra_smmu_iova_to_phys             :1010           2-level via tegra_smmu_pte_lookup
>                 drivers/iommu/tegra-smmu.c:806
>
> mtk_iommu_v1    mtk_iommu_v1_iova_to_phys           :593            Flat single-level table
>                 drivers/iommu/mtk_iommu_v1.c:396
>
> sprd-iommu      sprd_iommu_iova_to_phys             :423            Flat single-level table
>                 drivers/iommu/sprd-iommu.c:369
> --------------------------------------------------------------------------------------------------------
>
> Category 3 — Special / trivial
>
> ------------------------------------------------------------------------------------------------
> Driver        Function (file:line)                 Ops assignment  Mechanism
> ------------  -----------------------------------  --------------  -----------------------------
> fsl_pamu      fsl_pamu_iova_to_phys                :438            Identity: returns iova
>               drivers/iommu/fsl_pamu_domain.c:172                  (after aperture bounds check)
>
> virtio-iommu  viommu_iova_to_phys                  :1105           Interval tree reverse lookup
>               drivers/iommu/virtio-iommu.c:915                     (no page table)
> ------------------------------------------------------------------------------------------------
>
> Category 4 — io_pgtable_ops backends
>
> These implement struct io_pgtable_ops.iova_to_phys and are the ultimate walk
> functions called by Category 1 drivers.
>
> -----------------------------------------------------------------------------------------------------
> Backend     Function (file:line)                    Ops assignment  Walk strategy
> ----------  --------------------------------------  --------------  ---------------------------------
> ARM LPAE    arm_lpae_iova_to_phys                   :950            Visitor pattern via
> (64-bit)    drivers/iommu/io-pgtable-arm.c:734                      __arm_lpae_iopte_walk; covers
>                                                                     ARM_64_LPAE_S1, S2, ARM_MALI_LPAE
>
> ARM v7s     arm_v7s_iova_to_phys                    :716            Iterative do-while 2-level;
> (32-bit)    drivers/iommu/io-pgtable-arm-v7s.c:644                  handles contiguous entries
>
> Apple DART  dart_iova_to_phys                       :402            dart_get_last pre-walks to leaf
>             drivers/iommu/io-pgtable-dart.c:336                     table, then single lookup
> -----------------------------------------------------------------------------------------------------
>
> Category 5 — generic_pt framework
>
> All these drivers use IOMMU_PT_DOMAIN_OPS(fmt) which routes iova_to_phys into
> the template function pt_iommu_<fmt>_iova_to_phys at
> drivers/iommu/generic_pt/iommu_pt.h:170. The walk uses pt_walk_range +
> PT_MAKE_LEVELS to generate a fully-inlined unrolled per-level walk; OA
> extracted via pt_entry_oa_exact.
>
> --------------------------------------------------------------------------------------------
> Driver            Ops struct (file:line)                         Format
> ----------------  ---------------------------------------------  ---------------------------
> AMD IOMMU v1      amdv1_ops drivers/iommu/amd/iommu.c:2662       amdv1
>
> AMD IOMMU v2      amdv2_ops drivers/iommu/amd/iommu.c:2740       x86_64
>
> Intel VT-d        intel_fs_paging_domain_ops                     x86_64
> first-stage       drivers/iommu/intel/iommu.c:3886
>
> Intel VT-d        intel_ss_paging_domain_ops                     vtdss
> second-stage      drivers/iommu/intel/iommu.c:3897
>
> iommufd selftest  mock_domain_ops etc                            amdv1_mock / amdv1
>                   drivers/iommu/iommufd/selftest.c:403,411,425
>
> KUnit wrapper     pgtbl_ops                                      Delegates to io_pgtable_ops
>                   drivers/iommu/generic_pt/kunit_iommu_cmp.h:86  for comparison testing
> --------------------------------------------------------------------------------------------
>
> Sub-page offset handling
>
> When iova_to_phys(iova) is called with an IOVA that is not aligned to the
> start of the mapped page/block (e.g. iova_to_phys(1) when a 4KB page is mapped
> at IOVA 0), most implementations return the exact physical address including
> the sub-page offset (phys_base + offset). Two do not.
>
> Summary
>
> -----------------------------------------------------------------------------------------
> Implementation         Offset preserved?  Mechanism
> ---------------------  -----------------  -----------------------------------------------
> arm_lpae (io-pgtable)  YES                iopte_to_paddr(pte) | (iova & (block_size-1))
> arm_v7s (io-pgtable)   YES                iopte_to_paddr(pte) | (iova & ~LVL_MASK)
> dart (io-pgtable)      YES                iopte_to_paddr(pte) | (iova & (pgsize-1))
> sun50i-iommu           YES                page_addr + FIELD_GET(GENMASK(11,0), iova)
> exynos-iommu           YES                *_phys(entry) + *_offs(iova) per granularity
> riscv-iommu            YES                pfn_to_phys(pfn) | (iova & (pte_size-1))
> omap-iommu             YES                (descriptor & mask) | (va & ~mask)
> rockchip-iommu         YES                pt_address(pte) + rk_iova_page_offset(iova)
> msm_iommu              YES                HW PAR register + VA low bits spliced back in
> s390-iommu             NO                 pte & ZPCI_PTE_ADDR_MASK (offset discarded)
> tegra-smmu             YES                SMMU_PFN_PHYS(pfn) + SMMU_OFFSET_IN_PAGE(iova)
> mtk_iommu_v1           NO                 pte & ~(page_size-1) (offset discarded)
> sprd-iommu             YES                (pte << PAGE_SHIFT) + (iova & (page_size-1))
> fsl_pamu               YES (trivial)      return iova (identity mapping)
> virtio-iommu           YES                paddr + (iova - mapping->iova.start)
> generic_pt             YES                _pt_entry_oa_fast() | log2_mod(va, entry_lg2sz)
> -----------------------------------------------------------------------------------------
>
> Category 1 drivers (arm-smmu-v3, arm-smmu, apple-dart, qcom_iommu, ipmmu-vmsa,
> mtk_iommu) inherit the behavior of their io-pgtable backend — all preserve
> offset.
>
> Implementations that lose the offset
>
> s390-iommu (drivers/iommu/s390-iommu.c:989): After the 3-level ZPCI walk,
> returns pte & ZPCI_PTE_ADDR_MASK with no sub-page offset added back.
> iova_to_phys(0) and iova_to_phys(1) return the same page-aligned PA.
>
> mtk_iommu_v1 (drivers/iommu/mtk_iommu_v1.c:396): Looks up the PTE by
> iova >> PAGE_SHIFT (discarding offset), then returns pte & ~(page_size-1). No
> step adds the sub-page offset back.
>
>
> diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> index c8d8eff5373d30..8db16989270cd8 100644
> --- a/drivers/iommu/mtk_iommu_v1.c
> +++ b/drivers/iommu/mtk_iommu_v1.c
> @@ -401,7 +401,8 @@ static phys_addr_t mtk_iommu_v1_iova_to_phys(struct iommu_domain *domain, dma_ad
>
> spin_lock_irqsave(&dom->pgtlock, flags);
> pa = *(dom->pgt_va + (iova >> MT2701_IOMMU_PAGE_SHIFT));
> - pa = pa & (~(MT2701_IOMMU_PAGE_SIZE - 1));
> + pa = (pa & (~(MT2701_IOMMU_PAGE_SIZE - 1))) |
> + (iova & (MT2701_IOMMU_PAGE_SIZE - 1));
> spin_unlock_irqrestore(&dom->pgtlock, flags);
>
> return pa;
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index fe679850af2861..57d27f3a984ed6 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -1015,7 +1015,8 @@ static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
> pto = get_st_pto(ste);
> pte = READ_ONCE(pto[px]);
> if (pt_entry_isvalid(pte))
> - phys = pte & ZPCI_PTE_ADDR_MASK;
> + phys = (pte & ZPCI_PTE_ADDR_MASK) |
> + (iova & ~ZPCI_PTE_ADDR_MASK);
> }
> }
>
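
As a quick sanity check of the mtk_iommu_v1 hunk, here is a user-space model
of a flat single-level lookup with the proposed OR of the IOVA low bits. The
table contents, the 4 KiB page size, the flag bits and all toy_ names are
assumptions for illustration; the real driver reads dom->pgt_va under a
spinlock.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)

/* Hypothetical flat table: entry i translates IOVA page i; low bits are flags. */
static const uint32_t toy_pgt[4] = {
	0x80000000u | 0x3, /* IOVA 0x0000 -> PA 0x80000000 */
	0x80005000u | 0x3, /* IOVA 0x1000 -> PA 0x80005000 */
	0,                 /* IOVA 0x2000 is unmapped */
	0x80001000u | 0x3, /* IOVA 0x3000 -> PA 0x80001000 */
};

/* Lookup modeled on the '+' lines of the hunk: page base OR sub-page offset. */
static uint32_t toy_flat_iova_to_phys(uint32_t iova)
{
	uint32_t pa = toy_pgt[iova >> TOY_PAGE_SHIFT];

	return (pa & ~(TOY_PAGE_SIZE - 1)) | (iova & (TOY_PAGE_SIZE - 1));
}
```

Aligned lookups are unchanged and unaligned ones now carry the offset. One
thing the model makes visible: the hunk shows no validity check around the
OR, so in this toy an unmapped entry probed at an unaligned IOVA returns the
bare offset instead of 0. I have not checked whether that matters for the
real driver. The s390 hunk sits under pt_entry_isvalid(), so it does not have
this property.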