Re: [PATCH] usb: dwc3: ep0: Fix the possible missed request for handling delay STATUS phase

From: Baolin Wang
Date: Fri Feb 17 2017 - 00:41:33 EST

On 23 January 2017 at 19:57, Felipe Balbi <balbi@xxxxxxxxxx> wrote:
> Hi,
> Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> writes:
>> On Mon, 16 Jan 2017, Felipe Balbi wrote:
>>> > The gadget driver never calls usb_ep_queue in order to receive the next
>>> > SETUP packet; the UDC driver takes care of SETUP handling
>>> > automatically.
>>> yeah, that's another thing I'd like to change. Currently, we have no
>>> means to either try to implement device-initiated LPM without adding a
>>> ton of hacks to UDC drivers. If we require upper layers (composite.c,
>>> most of the time) to usb_ep_queue() separate requests for all 3 phases
>>> of a ctrl transfer, we can actually rely on the fact that a new SETUP
>>> phase hasn't been queued yet to trigger U3 entry.
>> I haven't given any thought to LPM.
> okay
>> However, requiring gadget drivers to request SETUP packets seems rather
>> questionable. It flies against the USB spec, which requires
> right, maybe SETUP is a bit of an overkill. DATA and STATUS, however,
> should be doable.
>> peripherals to accept SETUP packets at any time -- a device is not
>> allowed to NAK or STALL a SETUP packet (see in the USB-2 spec).
>> In fact, the hardware in UDCs probably isn't capable of doing it.
>> This means that to do what you want, the UDC driver would have to
>> accept SETUP packets at any time, and store the most recent packet
>> contents. Then, when the gadget driver submits a request, the UDC
>> driver would give it this stored data. It would also have to detect
> that's right, I missed that part.
>> and prevent a nasty race where the gadget driver tries to queue a
>> request on ep0 that is a response to an old SETUP, one that has already
>> been overwritten. I'm not even sure preventing this race would be
>> possible in your scheme.
>> The advantage to invoking the gadget driver's setup callback directly
>> from the UDC driver's interrupt handler is that the gadget driver will
>> know immediately when an old SETUP has become stale. (That's what
>> ep0_req_tag is for in f_mass_storage.) It also provides a concurrency
>> guarantee, because the driver does not re-enable UDC SETUP interrupts
>> until the handler is finished.
>>> Another detail that this helps is that PM (overall) becomes simpler as,
>>> most likely, we won't need to mess with transfer cancellation, for
>>> example.
>> System PM on a gadget is always troublesome. Even if the USB
>> connection is a wakeup source, it may not be possible to guarantee that
>> the gadget can wake up quickly enough to handle an incoming packet.
> that's true.
>>> > You are suggesting that status stage requests should not be queued
>>> > automatically by UDC drivers but instead queued explicitly by gadget
>>> > drivers. This would mean changing every UDC driver and every gadget
>>> > driver.
>>> yes, a bit of work but has been done before. One example that comes to
>>> mind is when I added ->udc_start() and ->udc_stop(). It's totally
>>> doable. We can, for instance, add a temporary
>>> "wants_explicit_ctrl_phases" flag to struct usb_gadget which, if set,
>>> will tell composite.c (or whatever) that the UDC wants explicitly queued
>>> ctrl phases.
>> The term used in the USB spec is "stage", not "phase". "Phase" refers
>> to the packets making up a single transaction: token, data, and
>> handshake.
>> Also, data stages are already explicit. So your temporary flag might
>> better be called "wants_explicit_status_stages".
> I stand corrected ;-)
>>> Then add support for that to each UDC and set the flag. Once all are
>>> converted, add one extra patch to remove the flag and the legacy
>>> code. This has, of course, the draw back of increasing complexity until
>>> everything is converted over; but if it's all done in a single series, I
>>> can't see any problems with that.
>>> > Also, it won't fix the race that Baolin Wang found. The setup routine
>>> well, it will help... see below.
>>> > is always called in interrupt context, so it can't sleep. Doing
>>> > anything non-trivial will require a separate task, and it's possible
>>> > that this task will try to enqueue the data-stage or status-stage
>>> > request before the UDC driver is ready to handle it (for example,
>>> > before or shortly after the setup routine returns).
>>> >
>>> > To work properly, the UDC driver must be able to accept a request for
>>> > ep0 any time after it invokes the setup callback -- either before the
>>> > callback returns or after.
>>> Right, all UDCs are *already* required to support this case anyway
>>> because of USB_GADGET_DELAYED_STATUS. There was a bug in DWC3, sure, but
>>> it was already required to support this case.
>>> By removing USB_GADGET_DELAYED_STATUS altogether and making phases more
>>> explict, we enforce this requirement and it'll be much easier to test
>>> for it IMO.
>> Okay, I can see the point of requiring explicit status requests.
>> Implementing it will be a little tricky, because right now some status
>> requests already are explicit (those for length-0 OUT transfers) while
>> others are implicit.
> exactly. And that's source of issues for every new UDC driver we get.
>> (One possible approach would be to have the setup routine return
>> different values for explicit and implicit status stages -- for
>> example, return 1 if it wants to submit an explicit status request.
>> That wouldn't be very different from the current
> not really, no. The idea was for composite.c and/or functions to support
> both methods (temporarily) and use "gadget->wants_explicit_stages" to
> explicitly queue DATA and STATUS. That would mean that f_mass_storage
> wouldn't have to return DELAYED_STATUS if
> (gadget->wants_explicit_stages).
> After all UDCs are converted over and set wants_explicit_stages (which
> should all be done in a single series), then we get rid of the flag and
> the older method of DELAYED_STATUS.

(Sorry for late reply due to my holiday)
I also met the problem pointed by Alan, from my test, I still want to
need one return value to indicate if it wants to submit an explicit
status request. Think about the Control-IN with a data stage, we can
not get the STATUS phase request from usb_ep_queue() call, and we need
to handle this STATUS phase request in dwc3_ep0_xfernotready(). But
Control-OUT will get one 0-length IN request for the status stage from
usb_ep_queue(), so we need one return value from setup routine to
distinguish these in dwc3_ep0_xfernotready(), or we can not handle
status request correctly. Maybe I missed something else.

>> On the other hand, I am very doubtful about requiring explicit setup
>> requests.
> right, me too ;-)

So do you suggest me continue to try to do this? Thanks.

Best Regards