Re: [PATCH v2] usb: dwc3: gadget: Prevent EPs resource conflict during StartTransfer

From: Thinh Nguyen

Date: Wed Nov 19 2025 - 21:07:46 EST


On Tue, Nov 18, 2025, Alan Stern wrote:
> On Wed, Nov 19, 2025 at 01:49:12AM +0000, Thinh Nguyen wrote:
> > On Mon, Nov 17, 2025, Alan Stern wrote:
> > > > Hi Alan,
> > > >
> > > > Can you help give your opinion on this?
> > >
> > > Well, I think the change to the API was made because drivers _were_
> > > calling these routines in interrupt context. That's what the commit's
> > > description says, anyway.
> > >
> > > One way out of the problem would be to change the kerneldoc for
> > > usb_ep_disable(). Instead of saying that pending requests will complete
> > > before the all returns, say that the the requests will be marked for
> > > cancellation (with -ESHUTDOWN) before the call returns, but the actual
> > > completions might happen asynchronously later on.
> >
> > The burden of synchronization would be shifted to the gadget drivers.
> > The problem with this is that gadget drivers may modify the requests
> > after usb_ep_disable() when it should not (e.g. the controller may still
> > be processing the buffer). Also, gadget drivers shouldn't call
> > usb_ep_enabled() until the requests are returned.
>
> No, they probably shouldn't, although I don't know if that would
> actually cause any trouble. It's not a good idea, in any case --
> particularly if the drivers want to re-use the same requests as before.

Right.

>
> The problem is that function drivers' ->set_alt() callbacks are expected
> to do two things: disable all the endpoints from the old altsetting and
> enable all the endpoints in the new altsetting. There's no way for any
> part of the gadget core to intervene between those things (for instance,
> to wait for requests to complete).
>
> > > The difficulty comes when a gadget driver has to handle a Set-Interface
> > > request, or Set-Config for the same configuration. The endpoints for
> > > the old altsetting/config have to be disabled and then the endpoints for
> > > the new altsetting/config have to be enabled, all while managing any
> >
> > Right.
> >
> > > pending requests. I don't know how various function drivers handle
> > > this, just that f_mass_storage is very careful about taking care of
> > > everything in a separate kernel thread that explicitly dequeues the
> > > pending requests and flushes the endpoints. In fact, this scenario was
> > > the whole reason for inventing the DELAYED_STATUS mechanism, because it
> > > was impossible to do all the necessary work within the callback routine
> > > for a control-request interrupt handler.
> > >
> >
> > If we want to keep usb_ep_disable in interrupt context, should we revise
> > the wording such that gadget drivers must ensure pending requests are
> > dequeued before calling usb_ep_disable()? That requests are expected to
> > be given back before usb_ep_disable().
> >
> > Or perhaps revert the commit above (commit b0d5d2a71641), fix the dwc3
> > driver for that, and gadget drivers need to follow the original
> > requirement.
>
> Function drivers would have to go to great lengths to guarantee that
> requests had completed before the endpoint is re-enabled. Right now
> their ->set_alt() callback routines are designed to run in interrupt
> context; they can't afford to wait for requests to complete.

Why is ->set_alt() designed for interrupt context? We can't expect
requests to be completed before usb_ep_disable() completes _and_ also
expect usb_ep_disable() be able to be called in interrupt context.

>
> The easiest way out is for usb_ep_disable() to do what the kerneldoc
> says: ensure that pending requests do complete before it returns. Can
> dwc3 do this? (And what if at some time in the future we want to start

The dwc3 can do that, but we need to note that usb_ep_disable() must be
executed in process context and might sleep. I suspect we may run into
some issues from some function drivers that expected usb_ep_disable() to
be executable in interrupt context.

> using an asynchronous bottom half for request completions, like usbcore
> does for URBs?)

Which one are you referring to? From what I see, even the host side
expected ->endpoint_disable to be executed in process context.

Perhaps we can introduce endpoint_flush() on gadget side for
synchronization if we want to keep usb_ep_disable() to be asynchronous?

>
> Let's face it; the situation is a mess.
>

Glad you're here to help with the mess :)

Thanks,
Thinh