Re: Intel IOMMU bug: xHCI faults during crash kernel boot

From: Michal Pecio

Date: Thu Jun 18 2026 - 00:46:42 EST


On Wed, 17 Jun 2026 21:57:02 -0300, Desnes Nunes wrote:
> Hello IOMMU mailing list,
>
> On Wed, Jun 10, 2026 at 12:32 PM Desnes Nunes <desnesn@xxxxxxxxxx> wrote:
> >
> > I have just found out the solution for the bug.
> >
> ...
> > In scalable mode, a PCI bus may populate only the upper root half
> > (UCTP) when all devices on that bus have devfn >= 0x80. On bus
> > 0x80, I have e1000e at 80:1f.6 (devfn 0xfe) and xHCI at 80:14.0
> > (devfn 0xa0), so the hardware root entry correctly has lo=0 and
> > hi=UCTP present.
> >
> > However, after copy_translation_tables(), I noticed that
> > root[128].hi was zeroed out (Present bit cleared) and that
> > root[128].lo held another (expected) different value.
> >
> > In short, the culprit here is having a zeroed LCTP, since in
> > copy_context_table() the allocation of new_ce for LCTP context
> > entries currently governs the pos variable, which is later used to
> > save new_ce entries for UCTP at tbl[tbl_idx + pos].
> > On the first iteration idx will be zero and old_ce_phys will be
> > empty, so the loop moves straight to devfn=0x80. At devfn 0x80, idx
> > wraps to 0 again ((devfn * 2) mod 256), but since no new_ce was
> > previously allocated for LCTP context entries, pos will remain zero
> > while copying UCTP context entries. After all upper context entries
> > are saved, tbl will receive new_ce from UCTP at tbl[tbl_idx + 0],
> > and not tbl[tbl_idx + 1]. These will be later written in
> > copy_translation_tables() to iommu->root_entry[bus].lo and
> > iommu->root_entry[bus].hi, which causes the bug.
> >
> > In summary, the hardware tables were correct, but the copy path
> > misplaced the UCTP table for bus 0x80 when dealing with a
> > zeroed-out LCTP during kdump.
> >
> > To fix this, I created a v3 patch that uses devfn to better track
> > which half we are copying, so UCTP-only buses (lo=0, hi=P) are
> > installed into the upper root half.
>
> 0001-iommu-vt-d-Fix-UCTP-context-table-slot-when-copying-.rfc.patch
>
> > I am doing some final tests now, but since this was a lot to digest,
> > comments at this stage will be most appreciated.
>
> FYI, all of my last tests looked OK.
>
> > To IOMMU maintainers: should I send this patch to the iommu mailing
> > list and move the discussion there?
>
> I meant as a new submission to the IOMMU mailing list, since this
> started in xHCI on the usb mailing list.
> Of course, that is if nobody has any comments or objections to the
> patch.

Looks like no one from IOMMU pays much attention in the first place.
Let's see if a subject change helps.

If you have a working patch which fixes this, just submit it following
the usual rules in Documentation/process/submitting-patches.rst.
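
For anyone joining from the new subject line, here is a rough user-space
model of the slot selection described in the quoted analysis. It is only
a sketch, not the kernel code and not the v3 patch: copy_halves(), the
enum labels and the slot_from_half switch are hypothetical names, and
the "fixed" branch merely assumes that the slot is chosen from the half
being copied rather than from pos.

```c
#include <stdbool.h>

/* Which half's copied table ended up in a slot (hypothetical labels). */
enum half { NONE = 0, LOWER = 1, UPPER = 2 };

/*
 * Simplified, hypothetical model of the scalable-mode copy loop.
 * slot[0] stands for tbl[tbl_idx]     (later root_entry[bus].lo),
 * slot[1] stands for tbl[tbl_idx + 1] (later root_entry[bus].hi).
 * slot_from_half == false: the final slot follows pos, as described
 * in the report. slot_from_half == true: the final slot follows the
 * half being copied (an assumed fix, not the actual patch).
 */
static void copy_halves(bool lctp, bool uctp, bool slot_from_half,
                        enum half slot[2])
{
    enum half cur = NONE;   /* half of the table currently being copied */
    int pos = 0;

    slot[0] = slot[1] = NONE;

    for (int devfn = 0; devfn < 256; devfn++) {
        int idx = (devfn * 2) % 256;    /* wraps to 0 at devfn 0x80 */

        if (idx != 0)
            continue;                   /* entry copying is not modeled */

        if (cur != NONE) {              /* save the finished lower table */
            slot[0] = cur;
            pos = 1;
        }

        bool present = (devfn < 0x80) ? lctp : uctp;
        if (!present) {
            if (devfn == 0) {
                devfn = 0x7f;           /* no LCTP: skip to the upper half */
                continue;
            }
            return;                     /* no UCTP: nothing more to copy */
        }
        cur = (devfn < 0x80) ? LOWER : UPPER;
    }

    /* The loop only runs to completion while copying the upper half. */
    if (slot_from_half)
        slot[cur == UPPER ? 1 : 0] = cur;
    else
        slot[pos] = cur;                /* pos is still 0 if LCTP was empty */
}
```

With lctp=false and uctp=true, the pos-based branch leaves the upper
table in slot[0] and slot[1] empty, which matches the lo/hi mix-up
reported for bus 0x80. The half-based branch puts it in slot[1].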

Regards,
Michal