Re: [PATCH v2 1/2] usb: dwc3: gadget: Bail out in pullup if soft reset timeout happens
From: Thinh Nguyen
Date: Tue Apr 04 2023 - 17:44:23 EST
On Tue, Apr 04, 2023, Krishna Kurapati PSSNV wrote:
>
>
> On 4/4/2023 5:19 AM, Thinh Nguyen wrote:
> > On Thu, Mar 30, 2023, Krishna Kurapati PSSNV wrote:
> > >
> > >
> > > On 3/30/2023 5:40 AM, Thinh Nguyen wrote:
> > > > On Wed, Mar 29, 2023, Krishna Kurapati PSSNV wrote:
> > > > >
> > > > >
> > > > > On 3/29/2023 2:50 AM, Thinh Nguyen wrote:
> > > > > > Hi,
> > > > > >
> > > > > > On Tue, Mar 28, 2023, Krishna Kurapati wrote:
> > > > > > > If the core soft reset timeout happens, avoid setting up event
> > > > > > > buffers and starting gadget as the writes to these registers
> > > > > > > may not reflect when in reset and setting the run stop bit
> > > > > > > can lead the controller to access wrong event buffer address
> > > > > > > resulting in a crash.
> > > > > > >
> > > > > > > Signed-off-by: Krishna Kurapati <quic_kriskura@xxxxxxxxxxx>
> > > > > > > ---
> > > > > > > drivers/usb/dwc3/gadget.c | 5 ++++-
> > > > > > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > > > > > index 3c63fa97a680..f0472801d9a5 100644
> > > > > > > --- a/drivers/usb/dwc3/gadget.c
> > > > > > > +++ b/drivers/usb/dwc3/gadget.c
> > > > > > > @@ -2620,13 +2620,16 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
> > > > > > > * device-initiated disconnect requires a core soft reset
> > > > > > > * (DCTL.CSftRst) before enabling the run/stop bit.
> > > > > > > */
> > > > > > > - dwc3_core_soft_reset(dwc);
> > > > > > > + ret = dwc3_core_soft_reset(dwc);
> > > > > > > + if (ret)
> > > > > > > + goto done;
> > > > > > > dwc3_event_buffers_setup(dwc);
> > > > > > > __dwc3_gadget_start(dwc);
> > > > > > > ret = dwc3_gadget_run_stop(dwc, true, false);
> > > > > > > }
> > > > > > > +done:
> > > > > > > pm_runtime_put(dwc->dev);
> > > > > > > return ret;
> > > > > > > --
> > > > > > > 2.40.0
> > > > > > >
> > > > > >
> > > > > > I think there's one more place that may needs this check. Can you also
> > > > > > add this check in __dwc3_set_mode()?
> > > > >
> > > > > Hi Thinh,
> > > > >
> > > > > Sure. Will do it.
> > > > > Will the below be good enough ? Or would it be good to add an error/warn log
> > > > > there>
> > > >
> > > > There's already a warning message in dwc3_core_soft_reset() if it fails.
> > > >
> > > > >
> > > > > kriskura@hu-kriskura-hyd:/local/mnt/workspace/krishna/skales2/skales/kernel$
> > > > > git diff drivers/usb/
> > > > > diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> > > > > index 476b63618511..8d1d213d1dcd 100644
> > > > > --- a/drivers/usb/dwc3/core.c
> > > > > +++ b/drivers/usb/dwc3/core.c
> > > > > @@ -210,7 +210,9 @@ static void __dwc3_set_mode(struct work_struct *work)
> > > > > }
> > > > > break;
> > > > > case DWC3_GCTL_PRTCAP_DEVICE:
> > > > > - dwc3_core_soft_reset(dwc);
> > > > > + ret = dwc3_core_soft_reset(dwc);
> > > > > + if (ret)
> > > > > + goto out;
> > > > >
> > > > > dwc3_event_buffers_setup(dwc);
> > > > >
> > > >
> > > > If soft-reset failed, the controller is in a bad state. We should not
> > > > perform any further operation until the next hard reset. We should flag
> > > > the controller as dead. I don't think we have the equivalent of the
> > > > host's HCD_FLAG_DEAD. It may require some work in the UDC core. Perhaps
> > > > we can flag within dwc3 for now and prevent any further operation for a
> > > > simpler fix.
> > > >
> > > Hi Thinh,
> > >
> > > Are you referring that if __dwc3_set_mode failed with core soft reset
> > > timing out, the caller i.e., dwc3_set_mode who queues the work need to know
> > > that the operation actually failed. So we can add a flag to indicate that
> > > gadget is dead and the caller of dwc3_set_mode can check the flag to see if
> > > the operation is successful or not.
> > >
> > > Or am I misunderstanding your comment ?
> > >
> >
> > Not just in __dwc3_set_mode(). I mean any time dwc3_core_soft_reset
> > fails, then we set this flag. So that it can prevent the user calling
> > any gadget ops causing more crashes/invalid behavior. The
> > dwc->softconnect is already wrong on pullup() on failure.
> >
> > So that we can have a check in different gadget ops. For pullup():
> >
> > static int dwc3_gadget_pullup() {
> > if (dwc->udc_is_dead) {
> > dev_err(dev, "reset me. x_x \n");
> > return;
> > }
> >
> > abc();
> > }
> >
> > Perhaps the effort is probably the same if we enhance the UDC core for
> > this? In any case, I'm fine either way.
> >
> > Thanks,
> > Thinh
>
> Hi Thinh,
>
> So you don't want UDC to retry pullup if it fails the first time ? As per
> patch-2 of this series, I was trying to propagate the EITMEDOUT to UDC so
> that the caller (most probably configfs) can take appropriate action as to
> whether it must retry pullup or not.
>
Now I'm confused. If the soft-reset times out, that means that the
soft-reset (self-clearing) bit isn't cleared. How can we retry if it's
stuck in this state? My impression is that soft-reset would not complete
at all. Is that not the case for you, or it's simply because we need a
longer soft-reset timeout?
Thanks,
Thinh