Re: [PATCH] usb: dwc3: ep0: Fix the possible missed request for handling delay STATUS phase

From: Felipe Balbi
Date: Mon Jan 16 2017 - 07:09:08 EST



Hi,

Baolin Wang <baolin.wang@xxxxxxxxxx> writes:
> Hi,
>
> On 16 January 2017 at 19:29, Felipe Balbi <balbi@xxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> Baolin Wang <baolin.wang@xxxxxxxxxx> writes:
>>> Hi,
>>>
>>> On 16 January 2017 at 18:56, Felipe Balbi <balbi@xxxxxxxxxx> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Baolin Wang <baolin.wang@xxxxxxxxxx> writes:
>>>>> When handing the SETUP packet by composite_setup(), we will release the
>>>>> dwc->lock. If we get the 'USB_GADGET_DELAYED_STATUS' result from setup
>>>>> function, which means we need to delay handling the STATUS phase.
>>>>
>>>> this sentence needs a little work. Seems like it's missing some
>>>> information.
>>>>
>>>> anyway, I get that we release the lock but...
>>>>
>>>>> But during the lock release period, maybe the request for handling delay
>>>>
>>>> execution of ->setup() itself should be locked. I can see that it's only
>>>> locked for set_config() which is rather peculiar.
>>>>
>>>> What exact request you had when you triggered this? (Hint: dwc3
>>>> tracepoints print out ctrl request bytes). IIRC, only set_config() or
>>>> f->set_alt() can actually return USB_GADGET_DELAYED_STATUS.
>>>
>>> Yes, when host set configuration for mass storage driver, it can
>>> produce this issue.
>>>
>>>>
>>>> Which gadget driver were you using when you triggered this?
>>>
>>> mass storage driver. When host issues the setting config request, we
>>> will get USB_GADGET_DELAYED_STATUS result from
>>> set_config()--->fsg_set_alt(). Then the mass storage driver will issue
>>> one thread to complete the status stage by ep0_queue() (this thread
>>> may be running on another core), then if the thread issues ep0_queue()
>>> too fast before we get the dwc->lock in dwc3_ep0_delegate_req() or
>>> before we get into the STATUS stage, then we can not handle this
>>> request for the delay STATUS stage in dwc3_gadget_ep0_queue().
>>>
>>>>
>>>> Another point here is that the really robust way of fixing this is to
>>>> get rid of USB_GADGET_DELAYED_STATUS altogether and just make sure
>>>> gadget drivers know how to queue requests for all three phases of a
>>>> Control Transfer.
>>>>
>>>> A lot of code will be removed from all gadget drivers and UDC drivers
>>>> while combining all of it in a single place in composite.c.
>>>>
>>>> The reason I'm saying this is that other UDC drivers might have similar
>>>> races already but they just haven't triggered.
>>>
>>> Yes, maybe.
>>>
>>>>
>>>>> STATUS phase has been queued into list before we set 'dwc->delayed_status'
>>>>> flag or entering 'EP0_STATUS_PHASE' phase, then we will miss the chance
>>>>> to handle the STATUS phase. Thus we should check if the request for delay
>>>>> STATUS phase has been enqueued when entering 'EP0_STATUS_PHASE' phase in
>>>>> dwc3_ep0_xfernotready(), if so, we should handle it.
>>>>>
>>>>> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxx>
>>>>> ---
>>>>> drivers/usb/dwc3/ep0.c | 14 ++++++++++++++
>>>>> 1 file changed, 14 insertions(+)
>>>>>
>>>>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
>>>>> index 9bb1f85..e689ced 100644
>>>>> --- a/drivers/usb/dwc3/ep0.c
>>>>> +++ b/drivers/usb/dwc3/ep0.c
>>>>> @@ -1123,7 +1123,21 @@ static void dwc3_ep0_xfernotready(struct dwc3 *dwc,
>>>>> dwc->ep0state = EP0_STATUS_PHASE;
>>>>>
>>>>> if (dwc->delayed_status) {
>>>>> + struct dwc3_ep *dep = dwc->eps[0];
>>>>> +
>>>>> WARN_ON_ONCE(event->endpoint_number != 1);
>>>>> + /*
>>>>> + * We should handle the delay STATUS phase here if the
>>>>> + * request for handling delay STATUS has been queued
>>>>> + * into the list.
>>>>> + */
>>>>> + if (!list_empty(&dep->pending_list)) {
>>>>> + dwc->delayed_status = false;
>>>>> + usb_gadget_set_state(&dwc->gadget,
>>>>> + USB_STATE_CONFIGURED);
>>>>
>>>> Isn't this patch also changing the normal case when usb_ep_queue() comes
>>>> later? I guess list_empty() protects against that...
>>>
>>> I think it will not change other cases, we only handle the delayed
>>> status and I've tested it for a while and I did not find any problem.
>>
>> Alright, it's important enough to fix this bug. Can you also have a look
>> into dropping USB_GADGET_DELAYED_STATUS altogether? If you're too busy,
>> no issues. It'll stay in my queue.
>
> Okay, I will have a look at f_mass_storage driver to see if we can
> drop USB_GADGET_DELAYED_STATUS. Thanks.

not only mass storage. It needs to be done for all drivers. The way to
do that is to teach functions that control transfers are composed of two
or three phases. If you look at UDC drivers today, they all have
peculiarities about control transfers to handle stuff that *maybe*
gadget drivers won't handle.

What we should do here is make sure that *all* 3 phases always have a
matching usb_ep_queue() coming from the upper layers. Whether
composite.c or f_*.c handles it, that's an implementation detail. But
just to illustrate the problem, we should be able to get rid of
dwc3_ep0_out_start() and assume that the upper layer will call
usb_ep_queue() when it wants to receive a new SETUP packet.

Likewise, we should be able to assume that STATUS phase will only start
based on a usb_ep_queue() call. That way we can remove
USB_GADGET_DELAYED_STATUS altogether, because that will *always* be the
case. There will be no races to handle apart from the normal case where
XferNotReady can come before or after usb_ep_queue(), but we already
have proper handling for that too.

--
balbi

Attachment: signature.asc
Description: PGP signature