Re: [PATCH RFT RFC] usb: xhci: Kill hosts with HCE or HSE on command timeout
From: Baolu Lu
Date: Fri Jun 19 2026 - 23:42:10 EST
On 6/18/2026 8:57 AM, Desnes Nunes wrote:
Hello IOMMU mailing list,
On Wed, Jun 10, 2026 at 12:32 PM Desnes Nunes<desnesn@xxxxxxxxxx> wrote:
I have just found out the solution for the bug....
In scalable mode, a PCI bus may populate only the upper root half
(UCTP) when all devices on that bus have devfn >= 0x80. On bus 0x80, I
have e1000e at 80:1f.6 (devfn 0xfe) and xHCI at 80:14.0 (devfn 0xa0),
so the hardware root entry correctly has lo=0 and hi=UCTP present.
However, after copy_translation_tables(), I noticed that root[128].hi
was zeroed out (Present bit cleared), while root[128].lo held another
(expected) different value.
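For reference, the devfn values quoted above follow from the usual PCI encoding (device number in the upper five bits, function in the lower three), the same layout the kernel's PCI_DEVFN() macro uses. A minimal sketch, where pci_devfn() and devfn_in_upper_half() are illustrative helper names and not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper: pack a 5-bit device and 3-bit function into a devfn,
 * using the same layout as the kernel's PCI_DEVFN() macro. */
static uint8_t pci_devfn(unsigned int dev, unsigned int fn)
{
    assert(dev < 32 && fn < 8);
    return (uint8_t)((dev << 3) | fn);
}

/* Illustrative helper: in scalable mode, devfn >= 0x80 is covered by the
 * upper context table pointer (UCTP), the rest by the lower one (LCTP). */
static int devfn_in_upper_half(uint8_t devfn)
{
    return devfn >= 0x80;
}
```

With these, 80:1f.6 gives devfn 0xfe and 80:14.0 gives devfn 0xa0, so both devices sit in the upper half and the bus needs only UCTP.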
In short, the culprit here is having a zeroed LCTP, since in
copy_context_table() the allocation of new_ce for LCTP context entries
currently governs the pos variable, which is later used to save new_ce
entries for UCTP at tbl[tbl_idx + pos].
On the first iteration idx will be zero, old_ce_phys will be empty,
thus this moves the loop straight to devfn=0x80. At devfn 0x80, idx
wraps to 0 again ((devfn * 2) mod 256), but since no new_ce was
previously allocated for LCTP context entries, pos will remain zero
while copying UCTP context entries. After all upper context entries
are saved, tbl will receive new_ce from UCTP at tbl[tbl_idx + 0], and
not tbl[tbl_idx + 1]. These will be later written in
copy_translation_tables() to iommu->root_entry[bus].lo and
iommu->root_entry[bus].hi, which causes the bug.
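The slot selection described above can be reproduced with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: copy_bus_buggy() is a hypothetical name, tbl[0] and tbl[1] stand in for tbl[tbl_idx + 0] and tbl[tbl_idx + 1], and the "copy" of a context table is reduced to carrying over its address.

```c
#include <stdint.h>

/* Simplified model of the slot logic for one bus in scalable mode.
 * old_lctp/old_uctp: old kernel's lower/upper context table pointers
 * (0 means not present). tbl[0]/tbl[1]: slots later written to
 * root_entry[bus].lo and root_entry[bus].hi. */
static void copy_bus_buggy(uint64_t old_lctp, uint64_t old_uctp,
                           uint64_t tbl[2])
{
    uint64_t new_ce = 0;
    int pos = 0;

    tbl[0] = 0;
    tbl[1] = 0;

    for (int devfn = 0; devfn < 256; devfn++) {
        int idx = (devfn * 2) % 256;
        uint64_t old;

        if (idx != 0)
            continue;           /* per-entry copying elided in this model */

        /* Entering a half: save the table built for the previous half. */
        if (new_ce) {
            tbl[0] = new_ce;
            pos = 1;            /* pos only advances if LCTP produced a table */
        }

        old = (devfn < 0x80) ? old_lctp : old_uctp;
        if (!old) {
            if (devfn == 0) {
                devfn = 0x7f;   /* no LCTP: loop increment lands on 0x80 */
                continue;
            }
            return;             /* no UCTP either: nothing more to store */
        }
        new_ce = old;           /* stands in for allocating and filling a copy */
    }

    /* With a zeroed LCTP, pos is still 0 here, so UCTP lands in the lo slot. */
    tbl[pos] = new_ce;
}
```

Calling this model with old_lctp = 0 and a nonzero old_uctp leaves the upper table in tbl[0] and tbl[1] empty, which matches the lo/hi mix-up observed on root[128].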
In summary, the hardware tables were correct, but the copy path
misplaced the UCTP table for bus 0x80 when dealing with a LCTP
zeroed-out during kdump.
To fix this, I created a v3 patch that uses devfn to better track
which half we are copying, so UCTP-only buses (lo=0, hi=P) are
installed into the upper root half.
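The idea of keying the slot on devfn can be sketched in the same simplified user-space terms. This is not the v3 patch itself, only an illustration of the approach; copy_bus_by_devfn() is a hypothetical name, and tbl[0] and tbl[1] again stand in for the lo and hi slots of one bus.

```c
#include <stdint.h>

/* Sketch: choose the destination slot from the half that devfn belongs to,
 * instead of from whether an LCTP table happened to be allocated earlier.
 * old_lctp/old_uctp: old lower/upper context table pointers (0 = absent). */
static void copy_bus_by_devfn(uint64_t old_lctp, uint64_t old_uctp,
                              uint64_t tbl[2])
{
    tbl[0] = 0;
    tbl[1] = 0;

    /* Visit the first devfn of each half: 0x00 (lower) and 0x80 (upper). */
    for (int devfn = 0; devfn < 256; devfn += 0x80) {
        int upper = devfn >= 0x80;
        uint64_t old = upper ? old_uctp : old_lctp;

        if (!old)
            continue;           /* this half is not present, leave slot at 0 */

        tbl[upper] = old;       /* stands in for the copied context table */
    }
}
```

In this model a UCTP-only bus (lo=0, hi=P) keeps its table in tbl[1], and buses with both halves or only the lower half behave as before.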
I am doing some final tests now, but since this was a lot to digest,
comments at this stage will be most appreciated.
FYI, all of my last tests looked OK.
To IOMMU maintainers: should I send this patch to the iommu mailing
list and move the discussion there?
Yes, absolutely. The iommu mailing list is the right place to discuss
bugs and fixes, so please go ahead.
I meant as a new submission to the IOMMU mailing list, since this started
in xHCI on the usb mailing list.
Of course, that is if nobody has any comments or objections to the patch.
Thanks,
baolu