Re: [PATCH] usb: dwc3: Disable AutoRetry controller feature for dwc_usb3 v2.00a
From: Thinh Nguyen
Date: Tue Jul 11 2023 - 17:49:07 EST
Hi Jakub,
On Thu, Jul 06, 2023, Jakub Vaněk wrote:
> On Wed, 2023-07-05 at 22:47 +0000, Thinh Nguyen wrote:
> > Hi,
> >
> > On Sat, Jul 01, 2023, Jakub Vanek wrote:
> > > AutoRetry has been found to cause issues in this controller revision.
> > > This feature allows the controller to send non-terminating/burst retry
> > > ACKs (Retry=1 and Nump!=0) as opposed to terminating retry ACKs
> > > (Retry=1 and Nump=0) to devices when a transaction error occurs.
> > >
> > > Unfortunately, some USB devices do not retry transfers when
> > > the controller sends them a non-terminating retry ACK. After the transfer
> > > times out, the xHCI driver tries to resume normal operation of the
> > > controller by sending a Stop Endpoint command to it. However, this
> > > revision of the controller fails to respond to that command in this
> > > state and the xHCI driver therefore gives up. This is reported via dmesg:
> > >
> > > [sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
> > > [sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
> > > xhci-hcd: xHCI host not responding to stop endpoint command
> > > xhci-hcd: xHCI host controller not responding, assume dead
> > > xhci-hcd: HC died; cleaning up
> > >
> > > This problem has been observed on Odroid HC2. This board has
> > > an integrated JMS578 USB3-to-SATA bridge. The above problem could be
> > > triggered by starting a read-heavy workload on an attached SSD.
> > > After a while, the host controller would die and the SSD would disappear
> > > from the system.
> > >
> > > Reverting b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in
> > > the controller") stopped the issue from occurring on Odroid HC2.
> > > As this problem hasn't been reported on other devices, disable
> > > AutoRetry just for the dwc_usb3 revision v2.00a that the board SoC
> > > (Exynos 5422) uses.
> > >
> > > Fixes: b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in the controller")
> > > Link: https://urldefense.com/v3/__https://lore.kernel.org/r/a21f34c04632d250cd0a78c7c6f4a1c9c7a43142.camel@xxxxxxxxx/__;!!A4F2R9G_pg!YWgF6oqfuVY6xstZmroukjlrwAFEYEQE8uj_iUu4fd10mnJxEPG345k75dMLLdNFP8rT1leok-aPNkz_FuAJ1zxnmw$ ;
> > > Link: https://urldefense.com/v3/__https://forum.odroid.com/viewtopic.php?t=42630__;!!A4F2R9G_pg!YWgF6oqfuVY6xstZmroukjlrwAFEYEQE8uj_iUu4fd10mnJxEPG345k75dMLLdNFP8rT1leok-aPNkz_FuCzOGIVPA$ ;
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Cc: Mauro Ribeiro <mauro.ribeiro@xxxxxxxxxxxxxx>
> > > Cc: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
> > > Suggested-by: Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx>
> > > Signed-off-by: Jakub Vanek <linuxtardis@xxxxxxxxx>
> > > ---
> > > drivers/usb/dwc3/core.c | 10 ++++++++--
> > > 1 file changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> > > index d68958e151a7..d742fc882d7e 100644
> > > --- a/drivers/usb/dwc3/core.c
> > > +++ b/drivers/usb/dwc3/core.c
> > > @@ -1218,9 +1218,15 @@ static int dwc3_core_init(struct dwc3 *dwc)
> > > * Host mode on seeing transaction errors(CRC errors or internal
> > > * overrun scenerios) on IN transfers to reply to the device
> > > * with a non-terminating retry ACK (i.e, an ACK transcation
> > > - * packet with Retry=1 & Nump != 0)
> > > + * packet with Retry=1 & Nump != 0).
> > > + *
> > > + * Do not enable this for DWC_usb3 v2.00a. This controller
> > > + * revision stops responding to Stop Endpoint commands if
> > > + * a USB device does not retry after a non-terminating retry
> > > + * ACK is sent to it.
> > > */
> > > - reg |= DWC3_GUCTL_HSTINAUTORETRY;
> > > + if (!DWC3_VER_IS(DWC3, 200A))
> > > + reg |= DWC3_GUCTL_HSTINAUTORETRY;
> > >
> > > dwc3_writel(dwc->regs, DWC3_GUCTL, reg);
> > > }
> > > --
> > > 2.25.1
> > >
> >
> > I brought up this inquiry internally. Please wait as I need to review
> > this further. The handling for this may be different depending on the
> > team's feedback.
> >
>
> OK; feel free to contact me if I could be of any help.
>
Sorry for the delay. After discussion, this behavior is the same across
different controller versions. The consensus is to disable this feature
altoghether. There will not be any noticeable performance impact.
Would you mind revise the patch to disable this feature for all?
Many thanks,
Thinh