Re: [PATCH v2 3/5] usb: dwc3: gadget: Adjust IRQ management during soft disconnect/connect

From: Wesley Cheng
Date: Wed Jul 20 2022 - 14:51:13 EST


Hi Thinh,

On 7/14/2022 10:38 AM, Thinh Nguyen wrote:
On 7/12/2022, Wesley Cheng wrote:
Local interrupts are currently being disabled as part of aquiring the
spin lock before issuing the endxfer command. Leave interrupts enabled, so
that EP0 events can continue to be processed. Also, ensure that there are
no pending interrupts before attempting to handle any soft
connect/disconnect.

Fixes: 861c010a2ee1 ("usb: dwc3: gadget: Refactor pullup()")
Signed-off-by: Wesley Cheng <quic_wcheng@xxxxxxxxxxx>
---
drivers/usb/dwc3/gadget.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index a455f8d4631d..ee85b773e3fe 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1674,6 +1674,7 @@ static int __dwc3_gadget_get_frame(struct dwc3 *dwc)
static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool interrupt)
{
struct dwc3_gadget_ep_cmd_params params;
+ struct dwc3 *dwc = dep->dwc;
u32 cmd;
int ret;
@@ -1682,7 +1683,9 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int
cmd |= interrupt ? DWC3_DEPCMD_CMDIOC : 0;
cmd |= DWC3_DEPCMD_PARAM(dep->resource_index);
memset(&params, 0, sizeof(params));
+ spin_unlock(&dwc->lock);
ret = dwc3_send_gadget_ep_cmd(dep, cmd, &params);
+ spin_lock(&dwc->lock);
WARN_ON_ONCE(ret);
dep->resource_index = 0;
@@ -2029,12 +2032,11 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
struct dwc3_ep *dep = to_dwc3_ep(ep);
struct dwc3 *dwc = dep->dwc;
- unsigned long flags;
int ret = 0;
trace_dwc3_ep_dequeue(req);
- spin_lock_irqsave(&dwc->lock, flags);
+ spin_lock(&dwc->lock);
list_for_each_entry(r, &dep->cancelled_list, list) {
if (r == req)
@@ -2073,7 +2075,7 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
request, ep->name);
ret = -EINVAL;
out:
- spin_unlock_irqrestore(&dwc->lock, flags);
+ spin_unlock(&dwc->lock);
return ret;
}
@@ -2489,9 +2491,7 @@ static int __dwc3_gadget_start(struct dwc3 *dwc);
static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
{
- unsigned long flags;
-
- spin_lock_irqsave(&dwc->lock, flags);
+ spin_lock(&dwc->lock);
dwc->connected = false;
/*
@@ -2506,10 +2506,10 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
reinit_completion(&dwc->ep0_in_setup);
- spin_unlock_irqrestore(&dwc->lock, flags);
+ spin_unlock(&dwc->lock);
ret = wait_for_completion_timeout(&dwc->ep0_in_setup,
msecs_to_jiffies(DWC3_PULL_UP_TIMEOUT));
- spin_lock_irqsave(&dwc->lock, flags);
+ spin_lock(&dwc->lock);
if (ret == 0)
dev_warn(dwc->dev, "timed out waiting for SETUP phase\n");
}
@@ -2523,7 +2523,7 @@ static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc)
*/
dwc3_stop_active_transfers(dwc);
__dwc3_gadget_stop(dwc);
- spin_unlock_irqrestore(&dwc->lock, flags);
+ spin_unlock(&dwc->lock);
/*
* Note: if the GEVNTCOUNT indicates events in the event buffer, the
@@ -2569,6 +2569,8 @@ static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
return 0;
}
+ synchronize_irq(dwc->irq_gadget);
+
if (!is_on) {
ret = dwc3_gadget_soft_disconnect(dwc);
} else {
@@ -3729,6 +3731,7 @@ void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
*/
__dwc3_stop_active_transfer(dep, force, interrupt);
+
}
static void dwc3_clear_stall_all_ep(struct dwc3 *dwc)

Hi Greg,

Please don't pick up this patch yet. We're still in discussion with
this. I have some concern with unlocking/locking when sending End
Transfer command. For example, this patch may cause issues with
DWC3_EP_END_TRANSFER_PENDING checks.

Hi Wesley,

Did you try out my suggestion yet?


Just providing a quick update.

So with your suggestion, I was able to consistently reproduce the controller halt issue after a day or so of testing. However, when I took a further look, I believe the problem is due to the DWC3 event handler:

static void dwc3_endpoint_interrupt(struct dwc3 *dwc,
const struct dwc3_event_depevt *event)
{
...
if (!(dep->flags & DWC3_EP_ENABLED)) {
if (!(dep->flags & DWC3_EP_TRANSFER_STARTED))
return;

/* Handle only EPCMDCMPLT when EP disabled */
if (event->endpoint_event != DWC3_DEPEVT_EPCMDCMPLT)
return;
}

The soft disconnect routine reached to the run/stop polling point, and I could see that DWC3_EP_DELAYED_STOP was set, and we got a xfercomplete event for the STATUS phase. However, since we exit early in the event handler (due to __dwc3_gadget_stop() being called and disabling EP0), the STATUS complete is never handled, and we do not issue the endxfer command.

I don't think I saw this issue with my change, as we allowed the STATUS phase handling to happen BEFORE gadget stop was called (since I released the lock in the stop active transfers API).

However, I think even with my approach, we'd eventually run into a possibility of this issue, as we aren't truly handling EP0 events while polling for the halted status due to the above. It was just reducing the chances. The scenario of this issue is coming because the host took a long time to complete the STATUS phase, so we ran into a "timed out waiting for SETUP phase," which allowed us to call the run/stop routine while we were not yet in the SETUP phase.

I'm currently running a change to add a EP num check to this IF condition:

if ((epnum > 1) && !(dep->flags & DWC3_EP_ENABLED)) {
if (!(dep->flags & DWC3_EP_TRANSFER_STARTED))
return;

/* Handle only EPCMDCMPLT when EP disabled */
if (event->endpoint_event != DWC3_DEPEVT_EPCMDCMPLT)
return;
}

Thanks
Wesley Cheng