Re: [PATCH v3] usb: dwc3: gadget: Prevent EP resource conflicts during StartTransfer
From: Thinh Nguyen
Date: Fri Mar 06 2026 - 16:43:08 EST
On Fri, Mar 06, 2026, Selvarasu Ganesan wrote:
>
> On 3/3/2026 6:09 AM, Thinh Nguyen wrote:
> > On Sat, Feb 28, 2026, Thinh Nguyen wrote:
> >> On Fri, Feb 27, 2026, Selvarasu Ganesan wrote:
> >>> The below “No resource for ep” warning appears when a StartTransfer
> >>> command is issued for bulk or interrupt endpoints in
> >>> `dwc3_gadget_ep_enable` while a previous StartTransfer on the same
> >>> endpoint is still in progress. The gadget function drivers can invoke
> >>> `usb_ep_enable` (which triggers a new StartTransfer command) before the
> >>> earlier transfer has completed. Because the previous StartTransfer is
> >>> still active, `dwc3_gadget_ep_disable` can skip the required
> >>> `EndTransfer` due to `DWC3_EP_DELAY_STOP`, leaving the endpoint
> >>> resources busy for the previous StartTransfer and triggering the
> >>> warning ("No resource for ep") from the dwc3 driver.
> >>>
> >>> Additionally, a race condition exists between dwc3_gadget_ep_disable()
> >>> and dwc3_gadget_ep_queue() when manipulating dep->flags. When
> >>> dwc3_gadget_ep_disable() calls dwc3_gadget_giveback(), the dwc->lock is
> >>> temporarily released. If dwc3_gadget_ep_queue() runs in that window, it
> >>> may set the DWC3_EP_TRANSFER_STARTED flag as part of
> >>> dwc3_send_gadget_ep_cmd(). When ep_disable resumes, it unconditionally
> >>> clears all flags except those explicitly masked, potentially clearing
> >>> DWC3_EP_TRANSFER_STARTED even though a new transfer has started. This
> >>> leads to "No resource for ep" warnings on subsequent StartTransfer
> >>> attempts.
> >>>
> >>> The underlying framework issue is that usb_ep_disable() is expected to
> >>> complete pending requests before returning, but is allowed to be called
> >>> from interrupt context where sleeping to wait for completion is not
> >>> possible.
> >>>
> >>> As temporary workarounds for this framework limitation:
> >>>
> >>> 1. In __dwc3_gadget_ep_enable(), add a check for the
> >>> DWC3_EP_TRANSFER_STARTED flag before issuing a new StartTransfer.
> >>> This prevents a second StartTransfer on an already busy endpoint,
> >>> eliminating the resource conflict.
> >>>
> >>> 2. In __dwc3_gadget_ep_disable(), preserve the DWC3_EP_TRANSFER_STARTED
> >>> flag when masking dep->flags if it is actually set, preventing the
> >>> race with dwc3_gadget_ep_queue() from corrupting the flag state.
> >>>
> >>> These changes eliminate the "No resource for ep" warnings and potential
> >>> kernel panics caused by panic_on_warn.
> >>>
> >>> dwc3 13200000.dwc3: No resource for ep1out
> >>> WARNING: CPU: 0 PID: 700 at drivers/usb/dwc3/gadget.c:398 dwc3_send_gadget_ep_cmd+0x2f8/0x76c
> >>> Call trace:
> >>> dwc3_send_gadget_ep_cmd+0x2f8/0x76c
> >>> __dwc3_gadget_ep_enable+0x490/0x7c0
> >>> dwc3_gadget_ep_enable+0x6c/0xe4
> >>> usb_ep_enable+0x5c/0x15c
> >>> mp_eth_stop+0xd4/0x11c
> >>> __dev_close_many+0x160/0x1c8
> >>> __dev_change_flags+0xfc/0x220
> >>> dev_change_flags+0x24/0x70
> >>> devinet_ioctl+0x434/0x524
> >>> inet_ioctl+0xa8/0x224
> >>> sock_do_ioctl+0x74/0x128
> >>> sock_ioctl+0x3bc/0x468
> >>> __arm64_sys_ioctl+0xa8/0xe4
> >>> invoke_syscall+0x58/0x10c
> >>> el0_svc_common+0xa8/0xdc
> >>> do_el0_svc+0x1c/0x28
> >>> el0_svc+0x38/0x88
> >>> el0t_64_sync_handler+0x70/0xbc
> >>> el0t_64_sync+0x1a8/0x1ac
> >>>
> >>> Cc: stable@xxxxxxxxxxxxxxx
> >>> Signed-off-by: Selvarasu Ganesan <selvarasu.g@xxxxxxxxxxx>
> >>> ---
> >>>
> >>> Note: No Fixes tag is added because this is a workaround for the
> >>> gadget framework issue where the gadget framework calls usb_ep_disable()
> >>> in interrupt context without ensuring endpoint flushing completes.
> >>> A proper fix requires refactoring the framework to make sure
> >>> usb_ep_disable is invoked in process context.
> >>>
> >>> Changes in v3:
> >>> - Revised the commit message to detail the real gadget framework issue
> >>> pointed out by the reviewer.
> >>> - Merged the two fixes for the same ep warning into one patch.
> >>> Link to v2: https://lore.kernel.org/linux-usb/20251117155920.643-1-selvarasu.g@xxxxxxxxxxx/
> >>>
> >>> Changes in v2:
> >>> - Removed change-id.
> >>> - Updated commit message.
> >>> Link to v1: https://lore.kernel.org/linux-usb/20251117152812.622-1-selvarasu.g@xxxxxxxxxxx/
> >>> ---
> >>> drivers/usb/dwc3/gadget.c | 22 ++++++++++++++++++++--
> >>> 1 file changed, 20 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> >>> index 0a688904ce8c5..3af1bbfe3d92b 100644
> >>> --- a/drivers/usb/dwc3/gadget.c
> >>> +++ b/drivers/usb/dwc3/gadget.c
> >>> @@ -971,8 +971,9 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
> >>> * Issue StartTransfer here with no-op TRB so we can always rely on No
> >>> * Response Update Transfer command.
> >>> */
> >>> - if (usb_endpoint_xfer_bulk(desc) ||
> >>> - usb_endpoint_xfer_int(desc)) {
> >>> + if ((usb_endpoint_xfer_bulk(desc) ||
> >>> + usb_endpoint_xfer_int(desc)) &&
> >>> + !(dep->flags & DWC3_EP_TRANSFER_STARTED)) {
> >>> struct dwc3_gadget_ep_cmd_params params;
> >>> struct dwc3_trb *trb;
> >>> dma_addr_t trb_dma;
> >>> @@ -1096,6 +1097,23 @@ static int __dwc3_gadget_ep_disable(struct dwc3_ep *dep)
> >>> */
> >>> if (dep->flags & DWC3_EP_DELAY_STOP)
> >>> mask |= (DWC3_EP_DELAY_STOP | DWC3_EP_TRANSFER_STARTED);
> >>> +
> >>> + /*
> >>> + * When dwc3_gadget_ep_disable() calls dwc3_gadget_giveback(),
> >>> + * the dwc->lock is temporarily released. If dwc3_gadget_ep_queue()
> >>> + * runs in that window it may set the DWC3_EP_TRANSFER_STARTED flag as
> >>> + * part of dwc3_send_gadget_ep_cmd. The original code cleared the flag
> >>> + * unconditionally in the mask operation, which could overwrite the
> >>> + * concurrent modification.
> >>> + *
> >>> + * As a workaround for the interrupt context constraint where we cannot
> >>> + * wait for endpoint flushing, preserve the DWC3_EP_TRANSFER_STARTED
> >>> + * flag if it is set, avoiding resource conflicts until the framework
> >>> + * is fixed to properly synchronize endpoint lifecycle management.
> >>> + */
> >>> + if (dep->flags & DWC3_EP_TRANSFER_STARTED)
> >>> + mask |= DWC3_EP_TRANSFER_STARTED;
> >>> +
> >>> dep->flags &= mask;
> >>>
> >>> /* Clear out the ep descriptors for non-ep0 */
> >>> --
> >>> 2.34.1
> >>>
> >> Acked-by: Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx>
> >>
> > Oh wait, don't pick this patch up yet.
> >
> > This will cause a regression for UAS devices. When switching the
> > alt-setting interface from BOT to UASP, the device needs to issue a
> > Start Transfer command.
> >
> > This workaround won't work. Can we fix the usb_ep_disable() interface
> > and rework this instead?
> >
> > BR,
> > Thinh
>
> Hi Thinh,
>
> We’re trying to see how this change could cause a regression for UAS
> devices.
> Could you explain why the workaround might be a problem for UAS? Are you
> concerned that it could miss a valid StartTransfer when a previous
> transfer finishes later than expected as part of ep_disable?

In UAS, the device controller uses the first PRIME to synchronize with
the host to determine whether the Start Transfer command can initiate
the stream. After configuring an endpoint, if we issue the StartTransfer
command too late, then the device controller may not initiate the
transfer (sending ERDY), and the host will not know when to start the
transfer.

So we have a workaround where the device starts and stops the endpoint
immediately, just to arm the endpoint for UASP transfers. In the newer
IPs, this workaround may not be needed.

>
> If we don’t use this temporary fix, the driver can still report “EP
> resource busy” when an earlier StartTransfer hasn’t finished
> before ep_disable returns. That can happen when a UAS device needs to
> start a new transfer during ep_enable while the prior transfer is still
> pending.
>
> The patch simply blocks a second StartTransfer when the same endpoint
> already has a transfer in progress to prevent an “EP resource busy” issue.
>
> And it can cause a new StartTransfer to be issued later from ep_queue
> while the StartTransfer that should have been started during ep_enable
> is skipped.
>
Also, if the gadget driver just uses usb_ep_disable() to handle the
teardown instead of proactively dequeuing all the active requests, then
the DWC3_EP_TRANSFER_STARTED flag will still be cleared immediately on
usb_ep_disable() and we will still run into this issue again.

Another workaround is to have the dwc3 driver retry the command after a
small delay if a no-resource error is reported, and only do dev_WARN
after the same failure has repeated a few times.
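
Roughly, the retry idea could look like the sketch below. This is plain
userspace C for illustration only, not dwc3 code and not tested against
hardware. Every name in it (EP_CMD_MAX_RETRIES, start_transfer_retry(),
the fake_ep helpers, the return codes) is hypothetical; a real change
would sit around dwc3_send_gadget_ep_cmd() and would need a delay that
is safe in the calling context.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical return codes standing in for the command status. */
#define EP_CMD_OK		0
#define EP_CMD_NO_RESOURCE	1

/* Hypothetical bound on attempts before warning. */
#define EP_CMD_MAX_RETRIES	3

/* One attempt at Start Transfer, and a short delay between attempts. */
typedef int (*start_transfer_fn)(void *ctx);
typedef void (*delay_fn)(void *ctx);

struct retry_result {
	int ret;		/* status of the last attempt */
	unsigned int tries;	/* number of attempts issued */
	bool warned;		/* true if the warning was emitted */
};

/*
 * Retry only on the no-resource status. Any other status is returned
 * immediately. The warning is emitted once, after the last attempt.
 */
static struct retry_result start_transfer_retry(start_transfer_fn start,
						delay_fn delay, void *ctx)
{
	struct retry_result res = { EP_CMD_NO_RESOURCE, 0, false };

	while (res.tries < EP_CMD_MAX_RETRIES) {
		res.ret = start(ctx);
		res.tries++;
		if (res.ret != EP_CMD_NO_RESOURCE)
			return res;
		/* No point delaying after the final attempt. */
		if (res.tries < EP_CMD_MAX_RETRIES)
			delay(ctx);
	}

	/* Stand-in for dev_WARN() in the driver. */
	fprintf(stderr, "No resource for ep after %u tries\n", res.tries);
	res.warned = true;
	return res;
}

/* Simulated endpoint: busy for the first `busy_for` attempts. */
struct fake_ep {
	unsigned int busy_for;
	unsigned int attempts;
	unsigned int delays;
};

static int fake_start(void *ctx)
{
	struct fake_ep *ep = ctx;

	return (ep->attempts++ < ep->busy_for) ? EP_CMD_NO_RESOURCE : EP_CMD_OK;
}

static void fake_delay(void *ctx)
{
	struct fake_ep *ep = ctx;

	ep->delays++;
}
```

With the simulated endpoint, a resource that frees up before the last
attempt succeeds quietly, and one that stays busy produces a single
warning. This only hides the window; it does not remove the race.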

Ideally, we should fix usb_ep_disable() and have the composite
framework properly handle the "wait" for completion.

BR,
Thinh