Re: [RFC] usb: host: xhci: Remove the watchdog timer and use command timer to watch stop endpoint command

From: Baolin Wang
Date: Thu Dec 01 2016 - 03:03:35 EST


On 1 December 2016 at 15:44, Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
> Hi,
>
> On 12/01/2016 03:35 PM, Baolin Wang wrote:
>> On 1 December 2016 at 14:35, Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
>>> Hi,
>>>
>>> On 12/01/2016 02:04 PM, Baolin Wang wrote:
>>>> Hi Baolu,
>>>>
>>>> On 1 December 2016 at 13:45, Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx> wrote:
>>>>> Hi,
>>>>>
>>>>> On 11/30/2016 05:02 PM, Baolin Wang wrote:
>>>>>> If the hardware never responds to the stop endpoint command, the
>>>>>> URBs will never be completed, and we might hang the USB subsystem.
>>>>>> The original watchdog timer is used to watch if one stop endpoint
>>>>>> command is timeout, if timeout, then the watchdog timer will set
>>>>>> XHCI_STATE_DYING, try to halt the xHCI host, and give back all
>>>>>> pending URBs.
>>>>>>
>>>>>> But now we already have one command timer to control command timeout,
>>>>>> thus we can also use the command timer to watch the stop endpoint
>>>>>> command, instead of one duplicate watchdog timer which need to be
>>>>>> removed.
>>>>>>
>>>>>> Meanwhile we don't need the 'stop_cmds_pending' flag to identy if
>>>>>> this is the last stop endpoint command of one endpoint. Since we
>>>>>> can make sure we only set one stop endpoint command for one endpoint
>>>>>> by 'EP_HALT_PENDING' flag in xhci_urb_dequeue() function. Thus remove
>>>>>> this flag.
>>>>> I am afraid you can't do this. "stop_cmds_pending" was added
>>>>> to fix the problem described in the comments that you want to
>>>>> remove. But I didn't find any fix of this problem in your patch.
>>>> Now we can not pending another stop endpoint command for the same one
>>>> endpoint, since will check 'EP_HALT_PENDING' flag in
>>>> xhci_urb_dequeue() function to avoid this. But after some
>>>> investigation, I think I missed the stop endpoint command in
>>>> xhci_stop_device() which did not check the 'EP_HALT_PENDING' flag,
>>>> maybe need to add 'EP_HALT_PENDING' flag checking in
>>>> xhci_stop_device() function. DId I miss something else? Thanks.
>>> Consider below three threads running on different CPUs at the same time.
>>>
>>> Thread A: xhci_handle_cmd_stop_ep() --- in interrupt handler
>>> Thread B: xhci_stop_endpoint_command_timeout() --- timer expired
>>> Thread C: xhci_urb_dequeue --- called by usb core
>>>
>>> They are serialized by xhci->lock. Let's consider below sequence:
>>>
>>> Thread A:
>>> - delete xhci->cmd_timer), but will fail due to Thread B.
>>> - clear EP_HALT_PENDING bit
>>>
>>> Thread B:
>>> - halt the host controller
>>>
>>> Thread C:
>>> - set EP_HALT_PENDING bit
>>> - enqueue another stop endpoint command
>>> - add the timer back
>> Ah, thanks for you comments.
>> If thread B halted the host, then the thread C xhci_urb_dequeue() will
>> check host state failed, then just return and can not set another stop
>> endpoint command.
>
> Not so simple. Thread B will release xhci->lock before xhci_halt().

Yes.

>
>> But from your example, I think your meaning is we
>> should not halt the host by thread B, since we have handled the stop
>> endpoint command event.
>>
>> So I think we need to check the xhci command of this stop endpoint
>> command in xhci_stop_endpoint_command_timeout(), if the xhci command
>> of this stop endpoint command is not in the command list (which means
>> the stop endpoint command event has been handled), then just return
>> and do not halt the controller. Right?
>>
>
> I'd like suggest you to put this change into a separated patch.
> It's actually a different topic from the main purpose of this patch.

OK, I will. Thanks for your comments.

--
Baolin.wang
Best Regards