Re: [PATCH v4 1/1] xhci: Correctly handle last TRB of isoc TD on Etron xHCI host

From: Mathias Nyman
Date: Wed Feb 05 2025 - 09:16:46 EST


On 5.2.2025 7.37, Kuangyi Chiang wrote:
> Unplugging a USB3.0 webcam while streaming results in errors like this:
>
> [ 132.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> [ 132.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
> [ 132.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> [ 132.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670 seg-start 000000002fdf8000 seg-end 000000002fdf8ff0
>
> If an error is detected while processing the last TRB of an isoc TD,
> the Etron xHC generates two transfer events for the TRB where the
> error was detected. The first event can be any sort of error (like
> USB Transaction or Babble Detected, etc), and the final event is
> Success.
>
> The xHCI driver will handle the TD after the first event and remove it
> from its internal list, and then print a "Transfer event TRB DMA ptr
> not part of current TD" error message after the final event.
>
> Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a
> transaction error mid TD.") is designed to address isoc transaction
> errors, but unfortunately it doesn't account for this scenario.
>
> Work around this by reusing the logic that handles isoc transaction
> errors, but continuing to wait for the final event when this condition
> occurs. Sometimes we see the Stopped event after an error mid TD; this
> is a normal event for a pending TD and we can think of it as the final
> event we are waiting for.
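
For reference, the addresses in the quoted log can be checked by hand. The
sketch below is only an illustration, not driver code: dma_in_td() is a
hypothetical helper that assumes a TD which does not wrap across ring
segments, and TRB_SIZE reflects the 16-byte xHCI TRB size. In both logged
cases the event DMA pointer sits exactly one TRB before trb-start, which is
consistent with the event belonging to the last TRB of a TD that was
already given back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TRB_SIZE 16u /* an xHCI TRB is 16 bytes */

/*
 * Hypothetical helper (not the kernel's implementation): true if event_dma
 * lies within [trb_start, trb_end]. Assumes the TD does not wrap across
 * ring segments.
 */
static bool dma_in_td(uint64_t event_dma, uint64_t trb_start, uint64_t trb_end)
{
	return event_dma >= trb_start && event_dma <= trb_end;
}

/* Hypothetical helper: true if event_dma is the TRB right before trb_start. */
static bool dma_is_prev_trb(uint64_t event_dma, uint64_t trb_start)
{
	return trb_start - event_dma == TRB_SIZE;
}

/* Re-check the two "Looking for event-dma ..." lines from the log. */
static void check_log_values(void)
{
	/* [ 132.646446] event-dma ...8630, current TD ...8640 to ...8650 */
	assert(!dma_in_td(0x2fdf8630ULL, 0x2fdf8640ULL, 0x2fdf8650ULL));
	assert(dma_is_prev_trb(0x2fdf8630ULL, 0x2fdf8640ULL));

	/* [ 132.646568] event-dma ...8660, current TD ...8670 to ...8670 */
	assert(!dma_in_td(0x2fdf8660ULL, 0x2fdf8670ULL, 0x2fdf8670ULL));
	assert(dma_is_prev_trb(0x2fdf8660ULL, 0x2fdf8670ULL));
}
```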

Not giving back the TD when we get an event for the last TRB in the
TD sounds risky. With this change we assume all existing and future Etron
hosts will trigger this additional spurious success event.

I think we could handle this more like the XHCI_SPURIOUS_SUCCESS case seen
with short transfers, and just silence the error message.
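
Roughly what I have in mind, as a toy model only. None of this is the
actual xhci-ring.c code: struct toy_ep, toy_classify_event() and the enums
are made-up names for illustration, and the idea of remembering the last
given-back TD's final TRB is an assumption about how it could be tracked.
The TD is still given back on the first (error) event; a later Success
event for that same TRB is dropped quietly instead of being reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy completion codes, not the real xHCI values. */
enum toy_comp { TOY_COMP_SUCCESS, TOY_COMP_ERROR };

enum toy_action { TOY_HANDLE_TD, TOY_IGNORE_QUIETLY, TOY_REPORT_ERROR };

/* Hypothetical per-endpoint bookkeeping for the most recently given-back TD. */
struct toy_ep {
	uint64_t last_td_end_dma;   /* DMA address of that TD's last TRB */
	bool     last_td_had_error; /* it completed with an error on its last TRB */
};

/*
 * Decide what to do with a transfer event.
 * quirk_spurious stands in for a host quirk flag in the spirit of
 * XHCI_SPURIOUS_SUCCESS; in_current_td is the result of the usual
 * "is this DMA pointer part of the current TD" lookup.
 */
static enum toy_action toy_classify_event(const struct toy_ep *ep,
					  bool quirk_spurious,
					  uint64_t event_dma,
					  bool in_current_td,
					  enum toy_comp code)
{
	if (in_current_td)
		return TOY_HANDLE_TD;

	/* Duplicate Success for the TRB we already handled: stay silent. */
	if (quirk_spurious && code == TOY_COMP_SUCCESS &&
	    ep->last_td_had_error && event_dma == ep->last_td_end_dma)
		return TOY_IGNORE_QUIETLY;

	/* Anything else outside the current TD is still a real error. */
	return TOY_REPORT_ERROR;
}
```

This keeps the TD completion timing unchanged and only affects whether the
error message is printed, so hosts that never send the extra event behave
as before.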

Are there any other issues besides the error message seen?

Thanks
Mathias