Re: [RFC] usb: dwc3: core: Fix RAM interface getting stuck during enumeration

From: Thinh Nguyen
Date: Thu Oct 12 2023 - 18:18:24 EST


On Fri, Oct 13, 2023, Krishna Kurapati PSSNV wrote:
>
>
> On 10/12/2023 11:29 PM, Thinh Nguyen wrote:
>
> > > -static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
> > > +int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
> > > {
> > > unsigned long flags;
> > > int ret;
> > > @@ -2701,7 +2701,7 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
> > > return ret;
> > > }
> > > -static int dwc3_gadget_soft_connect(struct dwc3 *dwc)
> > > +int dwc3_gadget_soft_connect(struct dwc3 *dwc)
> > > {
> > > int ret;
> > > @@ -3963,6 +3963,7 @@ static void dwc3_gadget_disconnect_interrupt(struct dwc3 *dwc)
> > > dwc3_gadget_dctl_write_safe(dwc, reg);
> > > dwc->connected = false;
> > > + dwc->cable_disconnected = true;
> > > dwc3_disconnect_gadget(dwc);
> > > @@ -4038,6 +4039,7 @@ static void dwc3_gadget_reset_interrupt(struct dwc3 *dwc)
> > > */
> > > dwc3_stop_active_transfers(dwc);
> > > dwc->connected = true;
> > > + dwc->cable_disconnected = false;
> > > reg = dwc3_readl(dwc->regs, DWC3_DCTL);
> > > reg &= ~DWC3_DCTL_TSTCTRL_MASK;
> > > --
> > > 2.42.0
> > >
> >
> > We can just reset the controller when there's End Transfer command
> > timeout as a failure recovery. No need to do what you're doing here.
> >
> Hi Thinh,
>
> That was what I initially wanted to do, but there were a couple of reasons I
> wanted to take this approach:
>
> 1. We can't just reset the controller in the midst of gadget_interrupt. We
> need to process it completely and then take action.

You can flag the driver so you can do the teardown/soft-reset at the
appropriate time.
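
As a rough illustration of the flagging idea, here is a minimal standalone
sketch. It is not the dwc3 code: struct sketch_ctrl and both helpers are
hypothetical names, and the actual teardown/soft-reset is reduced to a
counter. The point is only that the event handler records the failure and the
reset runs once event processing has finished.

```c
#include <stdbool.h>

/* Hypothetical stand-in for driver state; this is not the real struct dwc3. */
struct sketch_ctrl {
	bool recovery_pending;	/* set while handling events, consumed later */
	int resets_done;	/* counts recoveries, for illustration only */
};

/* Assumed to be called from the event handler when End Transfer times out. */
static void sketch_note_cmd_timeout(struct sketch_ctrl *c)
{
	c->recovery_pending = true;	/* only flag it; do not reset here */
}

/*
 * Assumed to be called after event processing has completed.
 * Returns true if a recovery was performed.
 */
static bool sketch_run_deferred_recovery(struct sketch_ctrl *c)
{
	if (!c->recovery_pending)
		return false;

	c->recovery_pending = false;
	c->resets_done++;	/* placeholder for teardown + soft reset */
	return true;
}
```

In a real driver the flag would need the usual locking, and the deferred step
would probably live in a work item or the threaded handler's tail; that part
is left out here.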

>
> 2. The above log was seen easily on the QRD variant of SM8550/SM8650. But on
> other platforms of the same targets, the issue comes up at other points in
> the code, where no IRQ is running. In such cases it's not possible to
> accurately find those code portions and reset the controller. The way I
> confirmed that both platforms have the same issue is:
>
> a. During cable disconnect, I am not receiving a disconnect interrupt
> b. The reg dump is exactly the same in both cases (BMU as well)
>
> So I felt it was better to fix it during cable disconnect because even if we
> remove cable, we are still in device mode only and in this case we can
> unblock suspend and also bring back controller to a known state.
>
> Let me know your thoughts on the above.
>

This issue happens outside of disconnect, right? Did you account for port
reset?

The symptom should be the same. At some point, a command will be issued.
If a command times out, then something is really wrong (especially for the
End Transfer command). We can attempt to recover based on this symptom then.

And you don't need to poll for timeout for this specific type of error.
Just read some known register like GSNPSID to see if it's invalid.
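
A standalone sketch of that check, for illustration only. The mask and IP
values below are assumptions modelled on what drivers/usb/dwc3/core.h defines
(DWC3_GSNPSID_MASK and the DWC3/DWC31/DWC32 IP IDs); please verify them
against the tree before relying on them. The helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, modelled on drivers/usb/dwc3/core.h; verify before use. */
#define SKETCH_GSNPSID_MASK	0xffff0000u
#define SKETCH_IP_DWC3		0x55330000u
#define SKETCH_IP_DWC31		0x33310000u
#define SKETCH_IP_DWC32		0x33320000u

/*
 * Hypothetical helper: returns true if a value read from GSNPSID carries one
 * of the expected IP IDs in its upper 16 bits. Reads such as 0x0 or
 * 0xffffffff are rejected.
 */
static bool sketch_gsnpsid_is_valid(uint32_t gsnpsid)
{
	uint32_t ip = gsnpsid & SKETCH_GSNPSID_MASK;

	return ip == SKETCH_IP_DWC3 || ip == SKETCH_IP_DWC31 ||
	       ip == SKETCH_IP_DWC32;
}
```

The caller would do a single register read and pass the value in, so no
polling loop is involved.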

BR,
Thinh