Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

From: Felipe Balbi
Date: Mon Mar 05 2018 - 06:26:43 EST



Hi,

Roger Quadros <rogerq@xxxxxx> writes:
> On 05/03/18 13:06, Felipe Balbi wrote:
>>
>> Hi,
>>
>> Baolin Wang <baolin.wang@xxxxxxxxxx> writes:
>>>>> Roger Quadros <rogerq@xxxxxx> writes:
>>>>>>> Roger Quadros <rogerq@xxxxxx> writes:
>>>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>>>>> after which dual-role switching doesn't work.
>>>>>>>>
>>>>>>>> On dra7-evm's dual-role port,
>>>>>>>> - Load g_zero gadget driver and enumerate to host
>>>>>>>> - suspend to mem
>>>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>>>>> - resume system
>>>>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>>>
>>>>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>>>>>>> so we don't wait in dwc3_gadget_stop().
>>>>>>>>
>>>>>>>> Signed-off-by: Roger Quadros <rogerq@xxxxxx>
>>>>>>>> ---
>>>>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>>>>>>> 1 file changed, 14 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>>>> index 2bda4eb..0a360da 100644
>>>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>>>>>>
>>>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>>>>> {
>>>>>>>> + int epnum;
>>>>>>>> + unsigned long flags;
>>>>>>>> +
>>>>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>>>>> +
>>>>>>>> + if (!dep)
>>>>>>>> + continue;
>>>>>>>> +
>>>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>>>>> + }
>>>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>>>> +
>>>>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>>>>
>>>>>>> free endpoints is a better place for this. It's already going to free
>>>>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>>>>
>>>>>>
>>>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>>>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>>>>
>>>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>
>>>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>>>>
>>>>> indeed. Iterating twice over the entire endpoint list seems
>>>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>>>>> that's essentially what this patch will do, right? If you clear the flag
>>>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>>>>> will do nothing. Might as well remove it.
>>>>>
>>>>
>>>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
>>>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>>>>
>>>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
>>>> masks all interrupts and nobody will ever clear that flag if it was set.
>>>
>>> I don't think so. It can not mask the endpoint events, please check
>>> the events which will be masked in DEVTEN register. The reason why we
>>> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that,
>>> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later
>>> than 100us, but now we may have freed the gadget irq which will cause
>>> crash.
>>
>> We could mask command complete events as soon as ->udc_stop() is called,
>> right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN
>> completely.
>
> But which bit in DEVTEN says Endpoint events are disabled?
>
>>
>> /me goes check databook
>>
>> At least on revision 2.60a of the databook, bit 10 is reserved. I wonder
>> if that's the start of all the problems. Anybody has access to older and
>> newer databook revisions so we can cross-check?
>>
>
> I can access v2.40 and v3.10 books.
>
> bit 10 is reserved on both
>
> Differences in v2.4 vs v3.10 are:
>
> bit 8 reserved vs L1SUSPEN
> bit 13 reserved vs StopOnDisconnectEn
> bit 14 reserved vs L1WKUPEVTEN

odd, at some point we lost command complete interrupt :-(

That line exists since first commit (see below), so that would mean it
existed in 1.73a (the revision the original was written against), but
vanished on 2.40a. Perhaps 2.00a still had it.

Hey John, do you know, off the top of your head, when we lost DEVTEN[10]
as mask/unmask bit for EP Command Completion IRQs?


commit 72246da40f3719af3bfd104a2365b32537c27d83
Author: Felipe Balbi <balbi@xxxxxx>
Date: Fri Aug 19 18:10:58 2011 +0300

usb: Introduce DesignWare USB3 DRD Driver

--
balbi

Attachment: signature.asc
Description: PGP signature