Re: suspend problem with dwc3 gadget / g_multi when usb disconnected
From: Felipe Balbi
Date: Thu Dec 12 2019 - 07:39:12 EST
Hi,
Martin Kepplinger <martin.kepplinger@xxxxxxx> writes:
>>> https://source.puri.sm/martin.kepplinger/linux-next/blob/next-20191205/librem5/arch/arm64/configs/librem5_defconfig
>>>
>>> So I want to have host and gadget mode (G_MULTI, most importantly a
>>> serial ACM console), and I'm having trouble with S3 suspend:
>>>
>>> * The most interesting case here: When I boot with USB disconnected
>>> (from the battery) I can't suspend or unload dwc3. After "echo mem >
>>> /sys/power/state" I get nothing in the my logs except
>>>
>>> [ 18.107380] PM: suspend entry (deep)
>>>
>>> and the system hangs. similarly, I can't work around this by rmmod -f
>>> dwc3. What can go wrong here? I don't know enough about usb hci and why
>>> this shouldn't work. What can I do to help you here?
>>>
>>> * When I boot with USB connected, then unplug USB, I get:
>>>
>>> [ 19.289602] dwc3 38100000.usb: request 000000006a19695c was not
>>> queued to ep3in
>>
>> please capture ftrace logs from dwc3. We have documentation for this:
>>
>> https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html
>>
>> (note to self: update email address :-)
>
> thanks for getting back at me! I hope I won't confuse you by seemingly
> different issues:
no worries :-)
> First, the above message again: I see it (on -next-20191212) every time
> a few seconds after startup, also when keeping usb connected:
>
> [ 38.375833] dwc3 38100000.usb: request 0000000051c55cfc was not
> queued to ep3in
Okay. This could be indicative of a race condition; we'll see
> and I append the "full" trace with that timestamp included. Just to
> paste a (maybe) relevant part here:
>
>
> irq/41-dwc3-310 [000] d..2 38.375802: dwc3_writel: addr
> 0000000023f39630 value 00050c08
> irq/41-dwc3-310 [000] d..2 38.375803: dwc3_readl: addr
> 0000000023f39630 value 00050808
> irq/41-dwc3-310 [000] d..2 38.375804: dwc3_gadget_ep_cmd:
> ep2in: cmd 'End Transfer' [50c08] params 00000000 00000000 00000000 -->
> status: Successful
> irq/41-dwc3-310 [000] d..2 38.375805: dwc3_readl: addr
> 000000002c8ca7cf value 40102400
> irq/41-dwc3-310 [000] d..2 38.375806: dwc3_writel: addr
> 000000002c8ca7cf value 40102540
> irq/41-dwc3-310 [000] d..2 38.375808: dwc3_readl: addr
> 00000000ad4ae082 value 00000003
> irq/41-dwc3-310 [000] d..2 38.375809: dwc3_writel: addr
> 00000000ad4ae082 value 00000003
> irq/41-dwc3-310 [000] d..2 38.375815: dwc3_ep_dequeue: ep3in:
> req 0000000051c55cfc length 0/0 zsI ==> 0
> irq/41-dwc3-310 [000] d..2 38.383461: dwc3_free_request:
> ep3in: req 0000000051c55cfc length 0/0 zsI ==> 0
> irq/41-dwc3-310 [000] d..2 38.383480: dwc3_gadget_ep_disable:
> ep2out: mps 512/1024 streams 15 burst 0 ring 16/11 flags E:swBp:>
irq/41-dwc3-310 [000] d..1 38.375648: dwc3_event: event (00000101): Reset [U0]
I don't see why we're getting this reset here.
irq/41-dwc3-310 [000] d..2 38.375664: dwc3_gadget_ep_disable: ep1in: mps 512/1024 streams 15 burst 0 ring 14/14 flags E:swBp:<
all endpoints are disabled and they're requests are freed. Nothing wrong there.
irq/41-dwc3-310 [000] d..2 38.375681: dwc3_gadget_ep_cmd: ep1in: cmd 'End Transfer' [30c08] params 00000000 00000000 00000000 --> status: Successful
irq/41-dwc3-310 [000] d..1 38.375690: dwc3_free_request: ep1in: req 000000007144a82a length 114/114 ZsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375698: dwc3_free_request: ep1in: req 0000000084a5d489 length 134/134 ZsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375700: dwc3_free_request: ep1in: req 000000004fd6d4f7 length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375702: dwc3_free_request: ep1in: req 0000000067b7b70a length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375704: dwc3_free_request: ep1in: req 00000000e19d65f0 length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375717: dwc3_free_request: ep1in: req 0000000005e9931c length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375719: dwc3_free_request: ep1in: req 0000000036de964f length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375721: dwc3_free_request: ep1in: req 0000000090a35af0 length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375723: dwc3_free_request: ep1in: req 000000002289a9c8 length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..1 38.375724: dwc3_free_request: ep1in: req 00000000113f6f0d length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..2 38.375726: dwc3_gadget_ep_disable: ep1out: mps 512/1024 streams 15 burst 0 ring 20/11 flags E:swBp:>
irq/41-dwc3-310 [000] d..2 38.375735: dwc3_gadget_ep_cmd: ep1out: cmd 'End Transfer' [20c08] params 00000000 00000000 00000000 --> status: Successful
irq/41-dwc3-310 [000] d..2 38.375743: dwc3_gadget_giveback: ep1out: req 00000000c4551c5c length 4294958658/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375755: dwc3_gadget_giveback: ep1out: req 00000000fbe64702 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375758: dwc3_gadget_giveback: ep1out: req 000000000d2e5947 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375761: dwc3_gadget_giveback: ep1out: req 000000003ae2da7d length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375763: dwc3_gadget_giveback: ep1out: req 0000000060248ce8 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375766: dwc3_gadget_giveback: ep1out: req 000000008d92c239 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375768: dwc3_gadget_giveback: ep1out: req 00000000b73f8103 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375771: dwc3_gadget_giveback: ep1out: req 00000000245d0fca length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375774: dwc3_gadget_giveback: ep1out: req 00000000a62e97dd length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375776: dwc3_gadget_giveback: ep1out: req 00000000af48ce89 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375781: dwc3_free_request: ep1out: req 00000000af48ce89 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375782: dwc3_free_request: ep1out: req 00000000a62e97dd length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375784: dwc3_free_request: ep1out: req 00000000245d0fca length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375785: dwc3_free_request: ep1out: req 00000000b73f8103 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375786: dwc3_free_request: ep1out: req 000000008d92c239 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375788: dwc3_free_request: ep1out: req 0000000060248ce8 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375789: dwc3_free_request: ep1out: req 000000003ae2da7d length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375791: dwc3_free_request: ep1out: req 000000000d2e5947 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375792: dwc3_free_request: ep1out: req 00000000fbe64702 length 0/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..1 38.375794: dwc3_free_request: ep1out: req 00000000c4551c5c length 4294958658/1578 zsI ==> -108
irq/41-dwc3-310 [000] d..2 38.375795: dwc3_gadget_ep_disable: ep2in: mps 8/1024 streams 15 burst 0 ring 5/5 flags E:swBp:<
irq/41-dwc3-310 [000] d..2 38.375804: dwc3_gadget_ep_cmd: ep2in: cmd 'End Transfer' [50c08] params 00000000 00000000 00000000 --> status: Successful
irq/41-dwc3-310 [000] d..2 38.375815: dwc3_ep_dequeue: ep3in: req 0000000051c55cfc length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..2 38.383461: dwc3_free_request: ep3in: req 0000000051c55cfc length 0/0 zsI ==> 0
irq/41-dwc3-310 [000] d..2 38.383480: dwc3_gadget_ep_disable: ep2out: mps 512/1024 streams 15 burst 0 ring 16/11 flags E:swBp:>
irq/41-dwc3-310 [000] d..2 38.383494: dwc3_gadget_ep_cmd: ep2out: cmd 'End Transfer' [40c08] params 00000000 00000000 00000000 --> status: Successful
> Now, I still have a lot configured in (G_MULTI and much more) and will
> compare that to less later. But I see a second issue:
>
> On next-20191212 now when shutting down or doing rmmod with USB
> disconnected, I get the following and can always reproduce it:
>
> [ 64.393272] WARNING: CPU: 3 PID: 884 at
> drivers/usb/dwc3/gadget.c:2719
> dwc3_stop_active_transfer.constprop.38+0xe8/0x100 [dwc3]
I don't see this timestamp in the trace data. Didn't you capture it? In
any case, this happens when End Transfer command returns an error:
static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
bool interrupt)
{
struct dwc3 *dwc = dep->dwc;
struct dwc3_gadget_ep_cmd_params params;
u32 cmd;
int ret;
if (!(dep->flags & DWC3_EP_TRANSFER_STARTED))
return;
/*
* NOTICE: We are violating what the Databook says about the
* EndTransfer command. Ideally we would _always_ wait for the
* EndTransfer Command Completion IRQ, but that's causing too
* much trouble synchronizing between us and gadget driver.
*
* We have discussed this with the IP Provider and it was
* suggested to giveback all requests here, but give HW some
* extra time to synchronize with the interconnect. We're using
* an arbitrary 100us delay for that.
*
* Note also that a similar handling was tested by Synopsys
* (thanks a lot Paul) and nothing bad has come out of it.
* In short, what we're doing is:
*
* - Issue EndTransfer WITH CMDIOC bit set
* - Wait 100us
*
* As of IP version 3.10a of the DWC_usb3 IP, the controller
* supports a mode to work around the above limitation. The
* software can poll the CMDACT bit in the DEPCMD register
* after issuing a EndTransfer command. This mode is enabled
* by writing GUCTL2[14]. This polling is already done in the
* dwc3_send_gadget_ep_cmd() function so if the mode is
* enabled, the EndTransfer command will have completed upon
* returning from this function and we don't need to delay for
* 100us.
*
* This mode is NOT available on the DWC_usb31 IP.
*/
cmd = DWC3_DEPCMD_ENDTRANSFER;
cmd |= force ? DWC3_DEPCMD_HIPRI_FORCERM : 0;
cmd |= interrupt ? DWC3_DEPCMD_CMDIOC : 0;
cmd |= DWC3_DEPCMD_PARAM(dep->resource_index);
memset(¶ms, 0, sizeof(params));
ret = dwc3_send_gadget_ep_cmd(dep, cmd, ¶ms);
WARN_ON_ONCE(ret);
dep->resource_index = 0;
if (dwc3_is_usb31(dwc) || dwc->revision < DWC3_REVISION_310A)
udelay(100);
}
> [ 64.405226] Modules linked in: rfkill st_magn_spi st_sensors_spi
> mousedev qmi_wwan caam_jr caamhash_desc caamalg_desc cdc_wdm option
> usbnet usb_wwan mii usbserial st_magn_i2c st_magn st_sensors_i2c
> st_sensors industrialio_triggered_buffer goodix kfifo_buf tcpci tcpm
> roles imx2_wdt watchdog aes_ce_blk crypto_simd typec bq25890_charger
> caam imx_sdma virt_dma error crct10dif_ce ghash_ce sha2_ce sha1_ce
> gpio_vibra usb_f_acm u_serial usb_f_rndis g_multi usb_f_mass_storage
> u_ether libcomposite ip_tables x_tables ipv6 nf_defrag_ipv6
> xhci_plat_hcd xhci_hcd usbcore dwc3 snvs_pwrkey ulpi udc_core
> phy_fsl_imx8mq_usb usb_common
> [ 64.462126] CPU: 3 PID: 884 Comm: ip Not tainted
> 5.5.0-rc1-next-20191212-librem5-00019-g4fb81fdd9c7d #1
> [ 64.471909] Hardware name: Purism Librem 5 devkit (DT)
> [ 64.477264] pstate: 60000085 (nZCv daIf -PAN -UAO)
> [ 64.482268] pc : dwc3_stop_active_transfer.constprop.38+0xe8/0x100 [dwc3]
> [ 64.489351] lr : dwc3_stop_active_transfer.constprop.38+0x64/0x100 [dwc3]
> [ 64.496421] sp : ffff00009f213240
> [ 64.499874] x29: ffff00009f213240 x28: 0000000000000000
> [ 64.505410] x27: ffff00009f2137b0 x26: ffff0000a4e0cc10
> [ 64.510946] x25: 0000000000000001 x24: ffff0000a324518c
> [ 64.516482] x23: ffff0000a3245193 x22: ffff0000a41f0080
> [ 64.522018] x21: 0000000000000000 x20: ffff0000a4176600
> [ 64.527553] x19: ffff800010ee87c8 x18: 0000000000000000
> [ 64.533089] x17: 0000000000000000 x16: 0000000000000000
> [ 64.538625] x15: 0000000000000000 x14: 0000000000000000
> [ 64.544160] x13: ffff0000a4176868 x12: ffff0000a4176868
> [ 64.549696] x11: ffffffffffff3f08 x10: 0000000000000008
> [ 64.555232] x9 : ffffffffffff3f04 x8 : 0000000000000811
> [ 64.560767] x7 : ffffffffffff3f00 x6 : ffffffffffff3f0c
> [ 64.566303] x5 : 0000000000000000 x4 : ffff80001007483c
> [ 64.571839] x3 : 0000000000000808 x2 : ffff00009f21327c
> [ 64.577375] x1 : 0000000000000040 x0 : 00000000ffffff92
> [ 64.582911] Call trace:
> [ 64.585474] dwc3_stop_active_transfer.constprop.38+0xe8/0x100 [dwc3]
> [ 64.592194] __dwc3_gadget_ep_disable+0x34/0x380 [dwc3]
> [ 64.597649] dwc3_gadget_ep_disable+0x44/0xf8 [dwc3]
> [ 64.602841] usb_ep_disable+0x34/0x100 [udc_core]
> [ 64.607749] eth_stop+0xb4/0x130 [u_ether]
> [ 64.612026] __dev_close_many+0xb4/0x138
> [ 64.616116] __dev_change_flags+0xb8/0x1d0
> [ 64.620385] dev_change_flags+0x28/0x68
> [ 64.624384] do_setlink+0x30c/0xc90
> [ 64.628021] __rtnl_newlink+0x3f8/0x788
> [ 64.632018] rtnl_newlink+0x54/0x80
> [ 64.635654] rtnetlink_rcv_msg+0x128/0x370
> [ 64.639926] netlink_rcv_skb+0x60/0x120
> [ 64.643924] rtnetlink_rcv+0x1c/0x28
> [ 64.647650] netlink_unicast+0x1b8/0x278
> [ 64.651739] netlink_sendmsg+0x1ac/0x3b0
> [ 64.655829] ____sys_sendmsg+0x250/0x298
> [ 64.659918] ___sys_sendmsg+0x88/0xc8
> [ 64.663735] __sys_sendmsg+0x70/0xc0
> [ 64.667463] __arm64_sys_sendmsg+0x28/0x30
> [ 64.671736] el0_svc_common.constprop.3+0x98/0x170
> [ 64.676729] el0_svc_handler+0x20/0x28
> [ 64.680638] el0_sync_handler+0x134/0x1a0
> [ 64.684817] el0_sync+0x140/0x180
> [ 64.688271] ---[ end trace 4f6aa846a9c6f20c ]---
>
> ... again. at every shutdown or rmmod.
>
> I hope that this might be already helpful.
I need to see the trace events for this failure case too. You could add
ftrace_dump_on_oops to your cmdline and change the WARN_ONCE() to a
BUG_ON() or something.
--
balbi
Attachment:
signature.asc
Description: PGP signature