Re: [PATCH] usb: xhci: Don't unchain link TRBs on quirky HCs

From: Mathias Nyman

Date: Tue Nov 11 2025 - 11:19:42 EST


On 11/7/25 12:08, Michal Pecio wrote:
Some old HCs ignore transfer ring link TRBs whose chain bit is unset.
This breaks endpoint operation and sometimes makes it execute other
ring's TDs, which may corrupt their buffers or cause unwanted device
action. We avoid this by chaining all link TRBs on affected rings.

Fix an omission which allows them to be unchained by cancelling TDs.

The patch was tested by reproducing this condition on an isochronous
endpoint (non-power-of-two TDs are sometimes split not to cross 64K)
and printing link TRBs in trb_to_noop() on good and buggy HCs.

Actual hardware malfunction is rare since it requires Missed Service
Error shortly before the unchained link TRB, at least on NEC and AMD.
I have never seen it after commit bb0ba4cb1065 ("usb: xhci: Apply the
link chain quirk on NEC isoc endpoints"), but it's Russian roulette
and I can't test all affected hosts and workloads. Fairly often MSEs
happen after cancellation because the endpoint was stopped.

Signed-off-by: Michal Pecio <michal.pecio@xxxxxxxxx>

Makes sense, thanks for fixing this

---
drivers/usb/host/xhci-ring.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index a9e468ea19c5..fc0043ca85a4 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -128,11 +128,11 @@ static void inc_td_cnt(struct urb *urb)
urb_priv->num_tds_done++;
}
-static void trb_to_noop(union xhci_trb *trb, u32 noop_type)
+static void trb_to_noop(union xhci_trb *trb, u32 noop_type, bool unchain_links)
{
if (trb_is_link(trb)) {
- /* unchain chained link TRBs */
- trb->link.control &= cpu_to_le32(~TRB_CHAIN);
+ if (unchain_links)
+ trb->link.control &= cpu_to_le32(~TRB_CHAIN);
} else {
trb->generic.field[0] = 0;
trb->generic.field[1] = 0;
@@ -465,7 +465,7 @@ static void xhci_handle_stopped_cmd_ring(struct xhci_hcd *xhci,
xhci_dbg(xhci, "Turn aborted command %p to no-op\n",
i_cmd->command_trb);
- trb_to_noop(i_cmd->command_trb, TRB_CMD_NOOP);
+ trb_to_noop(i_cmd->command_trb, TRB_CMD_NOOP, false);
/*
* caller waiting for completion is called when command
@@ -797,13 +797,18 @@ static int xhci_move_dequeue_past_td(struct xhci_hcd *xhci,
* (The last TRB actually points to the ring enqueue pointer, which is not part
* of this TD.) This is used to remove partially enqueued isoc TDs from a ring.
*/
-static void td_to_noop(struct xhci_td *td, bool flip_cycle)
+static void td_to_noop(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
+ struct xhci_td *td, bool flip_cycle)

we could avoid passing xhci pointer to td_to_noop() and just grab it from
the xhci_virt_ep structure instead. i.e. ep->xhci

Otherwise this looks good to me

Thanks
Mathias