Re: [PATCH v3 5/5] usb: xhci: Handle USB transaction error on address command

From: Lu Baolu
Date: Tue Aug 15 2017 - 22:15:58 EST


Hi,

On 08/15/2017 07:30 PM, Mathias Nyman wrote:
> On 11.08.2017 05:41, Lu Baolu wrote:
>> Xhci driver handles USB transaction errors on transfer events,
>> but transaction errors are possible on address device command
>> completion events as well.
>>
>> The xHCI specification (section 4.6.5) says: A USB Transaction
>> Error Completion Code for an Address Device Command may be due
>> to a Stall response from a device. Software should issue a Disable
>> Slot Command for the Device Slot then an Enable Slot Command to
>> recover from this error.
>>
>> This patch handles USB transaction errors on address command
>> completion events. The related discussion threads can be found
>> through below links.
>>
>> http://marc.info/?l=linux-usb&m=149362010728921&w=2
>> http://marc.info/?l=linux-usb&m=149252752825755&w=2
>>
>> Suggested-by: Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx>
>> Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
>> ---
>> drivers/usb/host/xhci.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
>> index d6b728d..95780f8 100644
>> --- a/drivers/usb/host/xhci.c
>> +++ b/drivers/usb/host/xhci.c
>> @@ -3822,6 +3822,11 @@ static int xhci_setup_device(struct usb_hcd *hcd, struct usb_device *udev,
>> break;
>> case COMP_USB_TRANSACTION_ERROR:
>> dev_warn(&udev->dev, "Device not responding to setup %s.\n", act);
>> +
>> + ret = xhci_disable_slot(xhci, udev->slot_id);
>> + if (!ret)
>> + xhci_alloc_dev(hcd, udev);
>
> Might be a xhci->mutex locking issue here,
> both xhci_setup_device() and xhci_alloc_dev() take xhci->mutex
>

I will apply xhci->mutex in this patch for code consistency, but I think
we can drop xhci->mutex (in a separated patch) anyway.

xhci->mutex was introduced to protect two race sources of xhci->slot_id
and xhci->addr_dev by below commit:

commit a00918d0521df1c7a2ec9143142a3ea998c8526d
Author: Chris Bainbridge <chris.bainbridge@xxxxxxxxx>
Date: Tue May 19 16:30:51 2015 +0300

usb: host: xhci: add mutex for non-thread-safe data

Regression in commit 638139eb95d2 ("usb: hub: allow to process more usb
hub events in parallel")

The regression resulted in intermittent failure to initialise a 10-port
hub (with three internal VL812 4-port hub controllers) on boot, with a
failure rate of around 8%, due to multiple race conditions when
accessing addr_dev and slot_id in struct xhci_hcd.

This regression also exposed a problem with xhci_setup_device, which
"should be protected by the usb_address0_mutex" but no longer is due to

commit 6fecd4f2a58c ("USB: separate usb_address0 mutexes for each bus")

With separate buses (and locks) it is no longer the case that a single
lock will protect xhci_setup_device from accesses by two parallel
threads processing events on the two buses.

Fix this by adding a mutex to protect addr_dev and slot_id in struct
xhci_hcd, and by making the assignment of slot_id atomic.

[--cut---]

We have already removed these two race sources after that by below
two commits:

c2d3d49 usb: xhci: move slot_id from xhci_hcd to xhci_command structure
87e44f2 usb: xhci: remove the use of xhci->addr_dev

So we don't need xhci->mutex any more. I will try to do this in a separated
patch with more tests. For now, I will add xhci->mutex for code consistency.

Best regards,
Lu Baolu