Re: [PATCH v2] usb: dwc3: gadget: Prevent EPs resource conflict during StartTransfer

From: Selvarasu Ganesan

Date: Mon Mar 02 2026 - 00:58:39 EST



On 2/27/2026 12:11 AM, Thinh Nguyen wrote:
> On Thu, Feb 26, 2026, Selvarasu Ganesan wrote:
>> On 12/12/2025 6:51 AM, Thinh Nguyen wrote:
>>> On Thu, Dec 11, 2025, Selvarasu Ganesan wrote:
>>>> On 12/5/2025 6:48 AM, Thinh Nguyen wrote:
>>>>>> I was hoping that the dwc3_gadget_ep_queue() won't come early to run
>>>>>> into this scenario. What I've provided will only mitigate and will not
>>>>>> resolve for all cases. It seems adding more checks in dwc3 will be
>>>>>> more messy.
>>>> Hi Thinh,
>>>>
>>>>
>>>> Thank you for the insightful comments. I agree that adding more checks
>>>> directly in the dwc3 driver would be messy, and a comprehensive rework
>>>> of the dwc3 ep disable would ultimately be the cleaner solution.
>>>>
>>>> In the meantime, Introducing additional checks for
>>>> DWC3_EP_TRANSFER_STARTED in dwc3 driver is the most practical way to
>>>> unblock the current issue while we work toward that longer‑term fix.
>>>> We have applied the patches and performed additional testing, no
>>>> regressions or new issues were observed.
>>>>
>>>> Could you please confirm whether below interim fix is acceptable along
>>>> with your proposed earlier patch for unblocking the current development
>>>> flow?
>>>>
>>>>
>>>> Patch 2: usb: dwc3: protect dep->flags from concurrent modify in
>>>> dwc3_gadget_ep_disable
>>>> =======================================================================================
>>>>
>>>> Subject: [PATCH] usb: dwc3: protect dep->flags from concurrent modify in
>>>> dwc3_gadget_ep_disable
>>>> The below warnings in `dwc3_gadget_ep_queue` observed during the RNDIS
>>>> enable/disable test is caused by a race between `dwc3_gadget_ep_disable`
>>>> and `dwc3_gadget_ep_queue`. Both functions manipulate `dep->flags`, and
>>>> the lock released temporarily by `dwc3_gadget_giveback` (called from
>>>> `dwc3_gadget_ep_disable`) can be acquired by `dwc3_gadget_ep_queue`
>>>> before `dwc3_gadget_ep_disable` has finished. This leads to an
>>>> inconsistent state of the `DWC3_EP_TRANSFER_STARTED` dep->flag.
>>>>
>>>> To fix this issue by add a condition check when masking `dep->flags`
>>>> in `dwc3_gadget_ep_disable` to ensure the `DWC3_EP_TRANSFER_STARTED`
>>>> flag is not cleared when it is actually set. This prevents the spurious
>>>> warning and eliminates the race.
>>>>
>>>> Thread#1:
>>>> dwc3_gadget_ep_disable
>>>>   ->__dwc3_gadget_ep_disable
>>>>    ->dwc3_remove_requests
>>>>     ->dwc3_stop_active_transfer
>>>>      ->__dwc3_stop_active_transfer
>>>>       -> dwc3_send_gadget_ep_cmd (cmd =DWC3_DEPCMD_ENDTRANSFER)
>>>>        ->if(!interrupt)dep->flags &= ~DWC3_EP_TRANSFER_STARTED;
>>>>         ->dwc3_gadget_giveback
>>>>          ->spin_unlock(&dwc->lock)
>>>>            ...
>>>>            While Thread#1 is still running, Thread#2 starts:
>>>>
>>>> Thread#2:
>>>> usb_ep_queue
>>>>   ->dwc3_gadget_ep_queue
>>>>    ->__dwc3_gadget_kick_transfer
>>>>     -> starting = !(dep->flags & DWC3_EP_TRANSFER_STARTED);
>>>>      ->if(starting)
>>>>        ->dwc3_send_gadget_ep_cmd (cmd = DWC3_DEPCMD_STARTTRANSFER)
>>>>         ->dep->flags |= DWC3_EP_TRANSFER_STARTED;
>>>>           ...
>>>>            ->__dwc3_gadget_ep_disable
>>>>             ->mask = DWC3_EP_TXFIFO_RESIZED |DWC3_EP_RESOURCE_ALLOCATED;
>>>>              ->dep->flags &= mask; --> // Possible of clears
>>>>                  DWC3_EP_TRANSFER_STARTED flag as well without
>>>>                  sending DWC3_DEPCMD_ENDTRANSFER
>>>>
>>>>  ------------[ cut here ]------------
>>>>   dwc3 13200000.dwc3: No resource for ep1in
>>>>   WARNING: CPU: 7 PID: 1748 at drivers/usb/dwc3/gadget.c:398
>>>> dwc3_send_gadget_ep_cmd+0x2f8/0x76c
>>>>   pc : dwc3_send_gadget_ep_cmd+0x2f8/0x76c
>>>>   lr : dwc3_send_gadget_ep_cmd+0x2f8/0x76c
>>>>   Call trace:
>>>>     dwc3_send_gadget_ep_cmd+0x2f8/0x76c
>>>>     __dwc3_gadget_kick_transfer+0x2ec/0x3f4
>>>>     dwc3_gadget_ep_queue+0x140/0x1f0
>>>>     usb_ep_queue+0x60/0xec
>>>>     mp_tx_task+0x100/0x134
>>>>     mp_tx_timeout+0xd0/0x1e0
>>>>     __hrtimer_run_queues+0x130/0x318
>>>>     hrtimer_interrupt+0xe8/0x340
>>>>     exynos_mct_comp_isr+0x58/0x80
>>>>     __handle_irq_event_percpu+0xcc/0x25c
>>>>     handle_irq_event+0x40/0x9c
>>>>     handle_fasteoi_irq+0x154/0x2c8
>>>>     generic_handle_domain_irq+0x58/0x80
>>>>     gic_handle_irq+0x48/0x104
>>>>     call_on_irq_stack+0x3c/0x50
>>>>     do_interrupt_handler+0x4c/0x84
>>>>     el1_interrupt+0x34/0x58
>>>>     el1h_64_irq_handler+0x18/0x24
>>>>     el1h_64_irq+0x68/0x6c
>>>>
>>>> Change-Id: Ib6a77ce5130e25d0162f72d0e52c845dbb1d18f5
>>>> Signed-off-by: Selvarasu Ganesan <selvarasu.g@xxxxxxxxxxx>
>>>> ---
>>>>  drivers/usb/dwc3/gadget.c | 16 ++++++++++++++++
>>>>  1 file changed, 16 insertions(+)
>>>>
>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>> index b42d225b67408..1dc5798072120 100644
>>>> --- a/drivers/usb/dwc3/gadget.c
>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>> @@ -1051,6 +1051,22 @@ static int __dwc3_gadget_ep_disable(struct
>>>> dwc3_ep *dep)
>>>>       */
>>>>      if (dep->flags & DWC3_EP_DELAY_STOP)
>>>>          mask |= (DWC3_EP_DELAY_STOP | DWC3_EP_TRANSFER_STARTED);
>>>> +
>>>> +    /*
>>>> +     * When dwc3_gadget_ep_disable() calls dwc3_gadget_giveback(),
>>>> +     * the  dwc->lock is temporarily released.  If dwc3_gadget_ep_queue()
>>>> +     * runs in that window it may set the DWC3_EP_TRANSFER_STARTED flag as
>>>> +     * part of dwc3_send_gadget_ep_cmd. The original code cleared the flag
>>>> +     * unconditionally, which could overwrite the concurrent modification.
>>>> +     *
>>>> +     * The added check ensures the DWC3_EP_TRANSFER_STARTED flag is only
>>>> +     * cleared if it is not set already, preserving the state updated
>>>> by the
>>>> +     * concurrent ep_queue path and eliminating the EP resource conflict
>>>> +     * warning.
>>>> +     */
>>> We need to explain the underlining problem here and in the commit
>>> message. The function usb_ep_disable() is expected be used interrupt
>>> context, and it's being used in interrupt context in the composite
>>> framework. There's no wait for flushing of endpoint is handled before
>>> usb_ep_disable completes.
>>>
>>> We are adding a temporary workaround to handle the endpoint
>>> reconfiguration and restart before the flushing completed.
>>>
>>>> +    if (dep->flags & DWC3_EP_TRANSFER_STARTED)
>>>> +        mask |= DWC3_EP_TRANSFER_STARTED;
>>>> +
>>>>      dep->flags &= mask;
>>>>
>>>>      /* Clear out the ep descriptors for non-ep0 */
>>>> --
>>>>
>>>> 2.31.1
>>>>
>>>>
>>> For your case, it may work because the endpoint is probably reconfigured
>>> to be the same in usb_ep_enable(). If we reconfigure the endpoint before
>>> the endpoint is stopped, the behavior is underfined.
>>>
>>> You can create the patches and Cc stable. However, I would not add the
>>> "Fixes" tag since they (IMHO) are not really fixes. May also need to
>>> note that under the "---" in the commit explain why there's no Fixes tag
>>> also.
>>>
>>> Thanks,
>>> Thinh
>>
>> Hi Thinh,
>>
>> This is a followup regarding the temporary workaround patches.
>>
>> In v3, I will incorporate the actual underlying issue into the commit
>> message proper, and include a note under the '---' separator explaining
>> why the Fixes tag was not added.
>>
>> Please review the updated patches in below and let me know if you have
>> any concerns.
>>
>>
>>
>> Subject: [PATCH v3 1/2] usb: dwc3: gadget: Prevent EPs resource conflict
>> during StartTransfer
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset=UTF-8
>> Content-Transfer-Encoding: 8bit
>>
>> The below “No resource for ep” warning appears when a StartTransfer
>> command is issued for bulk or interrupt endpoints in
>> `dwc3_gadget_ep_enable` while a previous StartTransfer on the same
>> endpoint is still in progress. The gadget functions drivers can invoke
>> `usb_ep_enable` (which triggers a new StartTransfer command) before the
>> earlier transfer has completed. Because the previous StartTransfer is
>> still active, `dwc3_gadget_ep_disable` can skip the required
>> `EndTransfer` due to `DWC3_EP_DELAY_STOP`, leading to the endpoint
>> resources are busy for previous StartTransfer and warning ("No resource
>> for ep") from dwc3 driver.
>>
>> The underlying framework issue is that `usb_ep_disable()` is expected to
>> complete pending requests before returning, but is allowed to be called
>> from interrupt context where sleeping to wait for completion is not
>> possible.
>>
>> Add a temporary workaround to handle the endpoint reconfiguration and
>> restart before the flushing completed. Specifically, a check is added to
>> `dwc3_gadget_ep_enable` that checks the `DWC3_EP_TRANSFER_STARTED`
>> flag before issuing a new StartTransfer. By preventing a second
>> StartTransfer on an already busy endpoint, the resource conflict is
>> eliminated, the warning disappears, and potential kernel panics caused
>> by `panic_on_warn` are avoided.
>>
>> ------------[ cut here ]------------
>> dwc3 13200000.dwc3: No resource for ep1out
>> WARNING: CPU: 0 PID: 700 at drivers/usb/dwc3/gadget.c:398
>> dwc3_send_gadget_ep_cmd+0x2f8/0x76c
>> Call trace:
>> dwc3_send_gadget_ep_cmd+0x2f8/0x76c
>> __dwc3_gadget_ep_enable+0x490/0x7c0
>> dwc3_gadget_ep_enable+0x6c/0xe4
>> usb_ep_enable+0x5c/0x15c
>> mp_eth_stop+0xd4/0x11c
>> __dev_close_many+0x160/0x1c8
>> __dev_change_flags+0xfc/0x220
>> dev_change_flags+0x24/0x70
>> devinet_ioctl+0x434/0x524
>> inet_ioctl+0xa8/0x224
>> sock_do_ioctl+0x74/0x128
>> sock_ioctl+0x3bc/0x468
>> __arm64_sys_ioctl+0xa8/0xe4
>> invoke_syscall+0x58/0x10c
>> el0_svc_common+0xa8/0xdc
>> do_el0_svc+0x1c/0x28
>> el0_svc+0x38/0x88
>> el0t_64_sync_handler+0x70/0xbc
>> el0t_64_sync+0x1a8/0x1ac
>>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Selvarasu Ganesan <selvarasu.g@xxxxxxxxxxx>
>> ---
>>
>> Note: No Fixes tag is added because this is a workaround for the
>> gadget framework issue where the gadget framework calls usb_ep_disable()
>> in interrupt context without ensuring endpoint flushing completes.
>> A proper fix requires refactoring the framework to make sure
>> usb_ep_disable is invoked in process context.
>> ---
>> drivers/usb/dwc3/gadget.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 502d54ce13bc..949a9e6b176a 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -951,8 +951,9 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep
>> *dep, unsigned int action)
>> * Issue StartTransfer here with no-op TRB so we can always rely on No
>> * Response Update Transfer command.
>> */
>> - if (usb_endpoint_xfer_bulk(desc) ||
>> - usb_endpoint_xfer_int(desc)) {
>> + if ((usb_endpoint_xfer_bulk(desc) ||
>> + usb_endpoint_xfer_int(desc)) &&
>> + !(dep->flags & DWC3_EP_TRANSFER_STARTED)) {
>> struct dwc3_gadget_ep_cmd_params params;
>> struct dwc3_trb *trb;
>> dma_addr_t trb_dma;
>> --
>> 2.34.1
>>
>>
>>
>> Subject: [PATCH v3 2/2] usb: dwc3: gadget: Protect endpoint flags during
>>  concurrent ep disable and queue
>>
>> A race condition exists between dwc3_gadget_ep_disable() and
>> dwc3_gadget_ep_queue() when manipulating dep->flags. The underlying
>> framework issue is that usb_ep_disable() is expected be used interrupt
>> context, and it's being used in interrupt context in the composite
>> framework. There's no wait for flushing of endpoint is handled before
>> usb_ep_disable completes.
>>
>> In the race scenario: when dwc3_gadget_ep_disable() calls
>> dwc3_gadget_giveback(), the dwc->lock is temporarily released. If
>> dwc3_gadget_ep_queue() runs in that window, it set the
>> DWC3_EP_TRANSFER_STARTED flag via dwc3_send_gadget_ep_cmd(). When
>> ep_disable resumes, it unconditionally clears all flags except those
>> explicitly masked, potentially clearing DWC3_EP_TRANSFER_STARTED even
>> though a new transfer has started. This leads to "No resource ep"
>> warnings on subsequent StartTransfer attempts.
>>
>> As a temporary workaround for the framework limitation, add a condition
>> check when masking `dep->flags` in `dwc3_gadget_ep_disable` to ensure
>> the `DWC3_EP_TRANSFER_STARTED` flag is not cleared when it is actually
>> set. This prevents the spurious warning and eliminates the race.
>>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Selvarasu Ganesan <selvarasu.g@xxxxxxxxxxx>
>> ---
>>
>> Note: No Fixes tag is added because this is a workaround for the
>> gadget framework issue where the gadget framework calls usb_ep_disable()
>> in interrupt context without ensuring endpoint flushing completes.
>> A proper fix requires refactoring the framework to make sure
>> usb_ep_disable is invoked in process context.
>> ---
>>  drivers/usb/dwc3/gadget.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 949a9e6b176a..1b16d103d94e 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -1078,6 +1078,23 @@ static int __dwc3_gadget_ep_disable(struct
>> dwc3_ep *dep)
>>          */
>>         if (dep->flags & DWC3_EP_DELAY_STOP)
>>                 mask |= (DWC3_EP_DELAY_STOP | DWC3_EP_TRANSFER_STARTED);
>> +
>> +       /*
>> +        * When dwc3_gadget_ep_disable() calls dwc3_gadget_giveback(),
>> +        * the dwc->lock is temporarily released. If dwc3_gadget_ep_queue()
>> +        * runs in that window it may set the DWC3_EP_TRANSFER_STARTED
>> flag as
>> +        * part of dwc3_send_gadget_ep_cmd. The original code cleared
>> the flag
>> +        * unconditionally in the mask operation, which could overwrite the
>> +        * concurrent modification.
>> +        *
>> +        * As a workaround for the interrupt context constraint where we
>> cannot
>> +        * wait for endpoint flushing, preserve the DWC3_EP_TRANSFER_STARTED
>> +        * flag if it is set, avoiding resource conflicts until the
>> framework
>> +        * is fixed to properly synchronize endpoint lifecycle management.
>> +        */
>> +       if (dep->flags & DWC3_EP_TRANSFER_STARTED)
>> +               mask |= DWC3_EP_TRANSFER_STARTED;
>> +
>>         dep->flags &= mask;
>>
>>         /* Clear out the ep descriptors for non-ep0 */
>> --
>> 2.34.1
>>
>>
> I think these 2 patches should be combined together into a single patch,
> and it looks good to me.

Hi Thinh,

Thank you for your feedback. I have posted the v3 patchset, which
combines the two patches into a single one.

https://lore.kernel.org/linux-usb/20260227121236.963-1-selvarasu.g@xxxxxxxxxxx/

Thanks,
Selva


>
> Thanks,
> Thinh