Re: [PATCH] usb: xhci: Assume that endpoints halt as specified

From: Mathias Nyman

Date: Tue Nov 11 2025 - 07:13:21 EST


On 11/7/25 12:13, Michal Pecio wrote:
xHCI 4.8.3 recommends that software should simply assume endpoints to
halt after certain events, without looking at the Endpoint Context for
confirmation, because HCs may be slow to update that.

While no cases of such "slowness" appear to be known, a different problem
exists on AMD Promontory chipsets: they may halt and generate a transfer
event, but never update the Endpoint Context at all, at least not until
some command is queued and fails with Context State Error. This is
easily triggered by disconnecting D- of a full speed serial device.

A possibly similar bug in non-AMD hardware has been reported to linux-usb.

In such a case, the failed TD is given back without being erased from the
ring and the endpoint isn't reset. If some URB is unlinked later, Stop
Endpoint fails and its handler resets the endpoint. On the next submission
the endpoint restarts on the stale TD. The outcome is a use-after-free if
the stale TD succeeds, or another halt if it fails, after which Dequeue
doesn't move and URBs are stuck. Unlinking and resubmitting the URBs
causes unlimited ring expansion if the situation repeats.

This can be solved by ignoring the Endpoint Context state and trusting that
endpoints halt when required, except in one known case in ancient hardware.
The check for "Already resolving halted ep" becomes redundant, because
for these completion codes we now jump to xhci_handle_halted_endpoint()
which deals with pending EP_HALTED internally.

Link: https://lore.kernel.org/linux-usb/20250311234139.0e73e138@foxbook/
Link: https://lore.kernel.org/linux-usb/20250918055527.4157212-1-zhangjinpeng@xxxxxxxxxx/
Signed-off-by: Michal Pecio <michal.pecio@xxxxxxxxx>

Makes sense, I guess we can only trust hardware to update the state in
the endpoint context on specific command completions, not transfer events.

Added to queue, thanks
Mathias