Re: [PATCH v1] iommu: Skip mapping at address 0x0 if it already exists

From: Antheas Kapenekakis

Date: Wed Feb 25 2026 - 17:05:56 EST


On Wed, 25 Feb 2026 at 19:27, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> On 2026-02-24 7:51 pm, Antheas Kapenekakis wrote:
> > On Tue, 24 Feb 2026 at 20:43, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >>
> >> On Tue, Feb 24, 2026 at 08:33:50PM +0100, Antheas Kapenekakis wrote:
> >>> On Tue, 24 Feb 2026 at 20:23, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >>>>
> >>>> On Sun, Feb 22, 2026 at 12:50:50AM +0100, Antheas Kapenekakis wrote:
> >>>>> Commit 789a5913b29c ("iommu/amd: Use the generic iommu page table")
> >>>>> introduces the shared iommu page table for AMD IOMMU. Some bioses
> >>>>> contain an identity mapping for address 0x0, which is not parsed
> >>>>> properly (e.g., certain Strix Halo devices). This causes the DMA
> >>>>> components of the device to fail to initialize (e.g., the NVMe SSD
> >>>>> controller), leading to a failed post.
> >>>>
> >>>> I'm trying to understand the issue here, is it that the old AMD code
> >>>> incorrectly succeeded iommu_map() on top of an existing mapping while
> >>>> the new code returns -EADDRINUSE?
> >>>>
> >>>> Then the existing guard for double mapping doesn't work since 0 is an
> >>>> ambiguous return?
> >>>
> >>> Hi Jason,
> >>>
> >>> It seems like the previous code correctly handled the 0 case, and the
> >>> new code does not due to the ambiguous return.
> >>
> >> I checked the old AMD driver it is definately returning 0 from
> >> iommu_iova_to_phys(), so that hasn't changed.
> >>
> >>>> reasonabe. Just please clean up the commit message to be a bit clearer
> >>>> on what caused this regression and add a short comment above the new if:
> >>>>
> >>>> 0 return from iommu_iova_to_phys() is ambiguous, it could mean a
> >>>> present mapping. EADDRINUSE is reliable when supported, it means the
> >>>> IOVA is already mapped, so ignore it to resolve the ambiguity.
> >>>
> >>> If you think that this patch is a long term solution, I can drop the
> >>> print, the addr == 0 check, add a comment on top, then repost a V2
> >>> with an updated description to be more concrete.
> >>
> >> I think it is a good solution as-is, I just want clarity on what
> >> actually regressed here, and I think it is that iommu_map() behavior
> >> changed.
> >
> > Reverting 789a5913b29c ("iommu/amd: Use the generic iommu page table")
> > plus two follow-up fixes also worked.
> >
> > Looking through it, the removed amd_iommu_map_pages() /
> > amd_iommu_iova_to_phys() in the previous ops might have behaved
> > differently.
>
> Yes, the old AMD code would happily replace an existing mapping without
> question (but whether it would correctly invalidate TLBs if just
> replacing a leaf entry with a different leaf entry at the same level
> I'm not entirely sure...)
>
> However, waiting for a mapping to fail (which may scream its own
> warning) and then attempting to paper over it seems like an awful bodge.
> Why not just do something like this?
>
> Cheers,
> Robin.
>
> ----->8-----
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 35db51780954..682e95c94236 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1212,8 +1212,11 @@ static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
>
> if (addr == end)
> goto map_end;
> -
> - phys_addr = iommu_iova_to_phys(domain, addr);
> + /*
> + * Check for existing translations using an offset, to
> + * distinguish a mapping of the page at PA 0 from none.
> + */
> + phys_addr = iommu_iova_to_phys(domain, addr + 1);

Hi Robin,
Assuming pg_size != 1, which it seems to always be due to `1UL <<
__ffs(domain->pgsize_bitmap)` is an interesting solution. Then again,
it assumes that the underlying implementations for iommu_iova_to_phys
would respect translating unaligned addresses. I cannot comment on
that.

I would tend to consider both solutions not ideal, and edge towards
Jason's suggestion of universally ignoring EADDRINUSE without an
error. However, I am not an absolutist in that regard.

I am on a business trip and do not have the SMTP credentials with me
for git, so I will send a V2 tomorrow. I already drafted it. I will
make sure to cc new participants on this thread if not already
included.

As an aside, it seems that the issue fixed by this patch is not the
only one introduced in 6.19, with [1] having users where this patch
does not help. It seems there are additional issues with thunderbolt
docks. iommu_map could be the culprit and perhaps forceful remapping,
as was done previously for AMD would be a more universal solution.
This way, EADDRINUSE stops being an error.

Best,
Antheas

[1] https://github.com/CachyOS/linux-cachyos/issues/704

> if (!phys_addr) {
> map_size += pg_size;
> continue;
>
>