Re: [RFC] usb: dwc3: core: Fix RAM interface getting stuck during enumeration

From: Wesley Cheng
Date: Thu Oct 12 2023 - 15:25:41 EST


Hi Krishna,

On 10/11/2023 3:02 AM, Krishna Kurapati wrote:
This implementation fixes the RAM interface getting stuck during
enumeration and the controller not responding to any command.

During plug-out test cases, it is sometimes seen that no events
are generated by the controller and all CSR register reads give "0"
and CSR_Timeout bit gets set indicating that CSR reads/writes are
timing out or timed out.

The issue comes up on different instances of enumeration on different
platforms. On one platform, the debug log is as follows:

Prepared a TRB on ep0out and did start transfer to get set
address request from host:

<...>-7191 [000] D..1. 66.421006: dwc3_gadget_ep_cmd: ep0out:
cmd 'Start Transfer' [406] params 00000000 efffa000 00000000 -->
status: Successful

<...>-7191 [000] D..1. 66.421196: dwc3_event: event (0000c040):
ep0out: Transfer Complete (sIL) [Setup Phase]

<...>-7191 [000] D..1. 66.421197: dwc3_ctrl_req: Set
Address(Addr = 01)

Then XFER NRDY was received on ep0in for the zero-length status phase,
and a Start Transfer was done on ep0in with a 0-length packet in the
2-stage status phase:

<...>-7191 [000] D..1. 66.421249: dwc3_event: event (000020c2):
ep0in: Transfer Not Ready [00000000] (Not Active) [Status Phase]

<...>-7191 [000] D..1. 66.421266: dwc3_prepare_trb: ep0in: trb
ffffffc00fcfd000 (E0:D0) buf 00000000efffa000 size 0 ctrl 00000c33
sofn 00000000 (HLcs:SC:status2)

<...>-7191 [000] D..1. 66.421387: dwc3_gadget_ep_cmd: ep0in: cmd
'Start Transfer' [406] params 00000000 efffa000 00000000 -->status:
Successful

Then a bus reset was received directly after 500 msec. Software never
got the cmd complete for the start transfer done in status phase. Here
the RAM interface is stuck. So host issues a bus reset as link is
idle for 500 msec:

<...>-7191 [000] D..1. 66.935603: dwc3_event: event (00000101):
Reset [U0]

Then software sees that it is in the status phase and issues an ENDXFER
on ep0in, which times out waiting for CMDACT to go to '0':

<...>-7191 [000] D..1. 66.958249: dwc3_gadget_ep_cmd: ep0in: cmd
'End Transfer' [10508] params 00000000 00000000 00000000 --> status:
Timed Out

Upon debug with Synopsys, it turns out that the root cause is as
follows:

During any transfer, if the data is not successfully transmitted,
then a Done (with failure) handshake is returned, so that the BMU
can re-attempt the same data again by rewinding its data pointers.

But, if the USB IN is a 0-length payload (which is what is happening
in this case - 2 stage status phase of set_address), then there is no
need to rewind the pointers and the Done (with failure) handshake is
not returned for failure case. This keeps the Request-Done interface
busy till the next Done handshake. The MAC sends the 0-length payload
again when the host requests. If the transmission is successful this
time, the Done (with success) handshake is provided back. Otherwise,
it repeats the same steps again.

If the cable is disconnected or if the Host aborts the transfer on 3
consecutive failed attempts, the Request-Done handshake is not
complete. This keeps the interface busy.

The subsequent RAM access cannot proceed until the above pending
transfer is complete. This results in failure of any access to RAM
address locations. Many of the EndPoint commands need to access the
RAM and they would fail to complete successfully.
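
The handshake behaviour described above can be illustrated with a toy
model. This is only a sketch of the description in this message, not of
the hardware or the driver: the struct and function names are
hypothetical, and the real Request-Done interface is internal to the
controller.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the dwc3 driver. */
struct req_done_iface {
	bool busy;	/* a Request is outstanding with no Done handshake yet */
};

/*
 * Model one IN transmission attempt of `len` bytes.
 * Returns true if a Done handshake (success or failure) was returned.
 */
static bool model_in_attempt(struct req_done_iface *rd, size_t len, bool tx_ok)
{
	rd->busy = true;		/* Request issued */

	if (tx_ok || len > 0) {
		/*
		 * Done (with success), or Done (with failure) so that the
		 * data pointers can be rewound for a retry.
		 */
		rd->busy = false;
		return true;
	}

	/* Failed 0-length payload: no Done handshake, interface stays busy. */
	return false;
}

/* Any RAM access (e.g. for an endpoint command) needs the interface idle. */
static bool model_ram_access_ok(const struct req_done_iface *rd)
{
	return !rd->busy;
}
```

In this model a failed 64-byte IN still frees the interface, while a
failed 0-length IN leaves it busy until a later attempt succeeds. If no
later attempt happens (cable removed, host gives up), every subsequent
RAM access is blocked.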

Furthermore, when cable removal happens, this would not generate a
disconnect event and the "connected" flag always remains true, blocking
suspend.

Synopsys confirmed that the issue is present on all USB3 devices and
as a workaround, suggested to re-initialize device mode.
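
The workaround amounts to a bounded wait followed by a re-initialization.
A minimal userspace sketch of that control flow, assuming a simplified
state struct in place of struct dwc3 (all names below are hypothetical
and the re-initialization is reduced to a flag reset):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant driver state; not struct dwc3. */
struct model_ctrl {
	bool connected;		/* mirrors the "connected" flag */
	int polls_until_event;	/* polls before a disconnect event; <= 0 = never */
	int reinit_count;	/* times device mode was re-initialized */
};

/* One poll interval: the disconnect event, if it ever comes, clears the flag. */
static void model_poll(struct model_ctrl *c)
{
	if (c->polls_until_event > 0 && --c->polls_until_event == 0)
		c->connected = false;
}

/*
 * After a cable disconnect: wait up to max_polls for the flag to clear,
 * then re-initialize device mode if it is still set.
 * Returns true if re-initialization was needed.
 */
static bool model_handle_cable_disconnect(struct model_ctrl *c, int max_polls)
{
	int remaining = max_polls;

	while (c->connected && remaining > 0) {
		model_poll(c);
		remaining--;
	}

	if (!c->connected)
		return false;	/* disconnect event arrived normally */

	/* Stuck: model soft disconnect + soft connect as a state reset. */
	c->connected = false;
	c->reinit_count++;
	return true;
}
```

The sketch tests the flag itself after the loop rather than the
remaining poll count, so a flag that clears on the final poll is not
treated as stuck.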

Signed-off-by: Krishna Kurapati <quic_kriskura@xxxxxxxxxxx>
---
drivers/usb/dwc3/core.c | 20 ++++++++++++++++++++
drivers/usb/dwc3/core.h | 4 ++++
drivers/usb/dwc3/drd.c | 5 +++++
drivers/usb/dwc3/gadget.c | 6 ++++--
4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 44ee8526dc28..d18b81cccdc5 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -122,6 +122,7 @@ static void __dwc3_set_mode(struct work_struct *work)
unsigned long flags;
int ret;
u32 reg;
+ u8 timeout = 100;
u32 desired_dr_role;
mutex_lock(&dwc->mutex);
@@ -137,6 +138,25 @@ static void __dwc3_set_mode(struct work_struct *work)
if (!desired_dr_role)
goto out;
+ /*
+ * STAR 5001544 - If cable disconnect doesn't generate
+ * disconnect event in device mode, then re-initialize the
+ * controller.
+ */
+ if ((dwc->cable_disconnected == true) &&
+ (dwc->current_dr_role == DWC3_GCTL_PRTCAP_DEVICE)) {
+ while (dwc->connected == true && timeout != 0) {
+ mdelay(10);
+ timeout--;
+ }
+
+ if (timeout == 0) {
+ dwc3_gadget_soft_disconnect(dwc);
+ udelay(100);
+ dwc3_gadget_soft_connect(dwc);
+ }
+ }
+
if (desired_dr_role == dwc->current_dr_role)
goto out;
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index c6c87acbd376..7642330cf608 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1355,6 +1355,7 @@ struct dwc3 {
int last_fifo_depth;
int num_ep_resized;
struct dentry *debug_root;
+ bool cable_disconnected;
};
#define INCRX_BURST_MODE 0
@@ -1568,6 +1569,9 @@ void dwc3_event_buffers_cleanup(struct dwc3 *dwc);
int dwc3_core_soft_reset(struct dwc3 *dwc);
+int dwc3_gadget_soft_disconnect(struct dwc3 *dwc);
+int dwc3_gadget_soft_connect(struct dwc3 *dwc);
+
#if IS_ENABLED(CONFIG_USB_DWC3_HOST) || IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE)
int dwc3_host_init(struct dwc3 *dwc);
void dwc3_host_exit(struct dwc3 *dwc);
diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c
index 039bf241769a..593c023fc39a 100644
--- a/drivers/usb/dwc3/drd.c
+++ b/drivers/usb/dwc3/drd.c
@@ -446,6 +446,8 @@ static int dwc3_usb_role_switch_set(struct usb_role_switch *sw,
struct dwc3 *dwc = usb_role_switch_get_drvdata(sw);
u32 mode;
+ dwc->cable_disconnected = false;
+
switch (role) {
case USB_ROLE_HOST:
mode = DWC3_GCTL_PRTCAP_HOST;
@@ -454,6 +456,9 @@ static int dwc3_usb_role_switch_set(struct usb_role_switch *sw,
mode = DWC3_GCTL_PRTCAP_DEVICE;
break;
default:
+ if (role == USB_ROLE_NONE)
+ dwc->cable_disconnected = true;
+

How do we handle cases where role switch isn't used? (i.e. extcon, or maybe no cable connection notification at all)

Thanks
Wesley Cheng