Re: Dell Inspiron 5558/0VNM2T hangs at resume from suspend when USB 3 is enabled

From: Diego Viola
Date: Fri Mar 24 2017 - 12:25:57 EST


On Thu, Mar 23, 2017 at 2:12 PM, Diego Viola <diego.viola@xxxxxxxxx> wrote:
> On Thu, Mar 23, 2017 at 2:02 PM, Mathias Nyman
> <mathias.nyman@xxxxxxxxxxxxxxx> wrote:
>> On 22.03.2017 19:51, Mathias Nyman wrote:
>>>
>>> On 22.03.2017 00:52, Diego Viola wrote:
>>>>
>>>> On Tue, Mar 21, 2017 at 12:29 PM, Diego Viola <diego.viola@xxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> On Tue, Mar 21, 2017 at 10:04 AM, Diego Viola <diego.viola@xxxxxxxxx>
>>>>> wrote:
>>>>>>
>>>>>> On Mon, Mar 20, 2017 at 8:15 PM, Diego Viola <diego.viola@xxxxxxxxx>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Mon, Mar 20, 2017 at 3:27 PM, Diego Viola <diego.viola@xxxxxxxxx>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Mon, Mar 20, 2017 at 1:32 PM, Mathias Nyman
>>>>>>>> <mathias.nyman@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> On 20.03.2017 17:39, Diego Viola wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Mar 20, 2017 at 11:21 AM, Mathias Nyman
>>>>>>>>>> <mathias.nyman@xxxxxxxxxxxxxxx> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 19.03.2017 23:29, Diego Viola wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Still a problem with 4.11.0-rc2-ARCH+
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> xhci tracing can be added with:
>>>>>>>>>>>
>>>>>>>>>>> mount -t debugfs none /sys/kernel/debug
>>>>>>>>>>> echo xhci-hcd >> /sys/kernel/debug/tracing/set_event
>>>>
>>>>
>>>> Here's the log I was able to obtain today, dmesg + ftrace at the time
>>>> of the crash:
>>>>
>>>> https://bugzilla.kernel.org/attachment.cgi?id=255419
>>>>
>>>> A USB keyboard and mouse were plugged in when I reproduced this.
>>>>
>>>> Please let me know if you need more info.
>>>>
>>>
>>> Thanks, I'm looking at the logs and so far the most suspicious looking
>>> entry is:
>>>
>>> [ 257.060941] rtsx_usb-254 0.... 119946155us : xhci_urb_enqueue:
>>> ep1out-bulk: urb ffff880105a93300 pipe 3221259520 length 0/12 sgs 0/0 stream
>>> 0 flags 00010000
>>> [ 257.063601] rtsx_usb-254 0.... 119946162us : xhci_urb_enqueue:
>>> ep0out-control: urb ffff880105a93300 pipe 2147484928 length 0/0 sgs 0/0
>>> stream 0 flags 00100000
>>>
>>> It enqueues the same URB without ever giving it back or actually
>>> queuing any TRBs for it. Well, it might just fail to enqueue it in the
>>> first place.
>>>
>>> I need to search the trace for a URB that has been dequeued but never
>>> given back.
>>
>>
>> Ok, found a much more likely candidate:
>>
>> [ 258.004078] kworker/-544 0d..1 121599183us : xhci_urb_dequeue:
>> ep1out-bulk: urb ffff880105a930c0 pipe 3221259520...
>>
>> We try to kill this URB "ffff880105a930c0" twice, and it's never given back.
>> The trace is missing the "xhci_dbg_cancel_urb: Cancel URB..." entry after
>> xhci_urb_dequeue, so the URB never got added to the cancellation list in
>> the xhci driver.
>>
>> xhci_urb_dequeue() has one place where it just returns an error without
>> giving back the URB or queuing it for cancellation.
>> This is, in my opinion, a bug in xhci_urb_dequeue().
>>
>> rtsx_usb_ms is a good test for USB; it seems to be constantly queuing URBs
>> at all inappropriate times.
>>
>> If I write a patch, can you try it out?
>
> Yes.
>
>>
>> -Mathias
>>
>>
>>
>
> Thanks,
> Diego

Hi Mathias,

I tested your patch with Linux 4.11-rc3 and can confirm that it solves
the problem.

I've tested suspend and resume with i3lock 150 times and it works.

Thank you, I appreciate it a lot.

Diego
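
For anyone repeating the trace search described in the quoted analysis (find a URB that was dequeued but never given back), here is a rough sketch of how it could be scripted. It is not something used in this thread. It assumes that each trace event sits on a single line, that the line contains "urb <address>" as in the excerpts above, and that givebacks show up under an event named xhci_urb_giveback. The function name and the SAMPLE list are made up for illustration; the two sample lines are copied from the quoted log, with mail line wrapping removed.

```python
import re
from collections import defaultdict

# Assumption: one trace event per line, shaped like the excerpts quoted in
# this thread: "<event>: <endpoint>: urb <hex address> ...".
# Assumption: givebacks are logged under the event name "xhci_urb_giveback".
EVENT_RE = re.compile(
    r"(xhci_urb_(?:enqueue|dequeue|giveback)):.*?\burb ([0-9a-f]+)"
)


def dequeued_not_given_back(lines):
    """Return {urb_address: dequeue_count} for URBs that were dequeued
    and have no giveback event after those dequeues."""
    pending = defaultdict(int)
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not a URB event, or the line was wrapped or truncated
        event, urb = match.groups()
        if event == "xhci_urb_dequeue":
            pending[urb] += 1
        elif event == "xhci_urb_giveback":
            # The URB came back, so earlier dequeues are accounted for.
            pending.pop(urb, None)
    return dict(pending)


# Two lines taken from the log quoted above (hypothetical variable name).
SAMPLE = [
    "[ 257.060941] rtsx_usb-254 0.... 119946155us : xhci_urb_enqueue: "
    "ep1out-bulk: urb ffff880105a93300 pipe 3221259520 length 0/12 "
    "sgs 0/0 stream 0 flags 00010000",
    "[ 258.004078] kworker/-544 0d..1 121599183us : xhci_urb_dequeue: "
    "ep1out-bulk: urb ffff880105a930c0 pipe 3221259520...",
]

if __name__ == "__main__":
    for urb, count in dequeued_not_given_back(SAMPLE).items():
        print(f"urb {urb}: dequeued {count} time(s), no giveback seen")
```

To run it on a real capture, pass the lines of the saved dmesg/ftrace file instead of SAMPLE. A URB reported here is only a candidate: the giveback may simply fall outside the captured window.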