[PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI commmand timeout

From: Rajesh Bhagat
Date: Fri Mar 18 2016 - 03:16:06 EST


We are facing issue while performing the system resume operation from STR
where XHCI is going to indefinite hang/sleep state due to
wait_for_completion API called in function xhci_alloc_dev for command
TRB_ENABLE_SLOT which never completes.

Now, xhci_handle_command_timeout function is called and prints
"Command timeout" message but never calls complete API for above
TRB_ENABLE_SLOT command as xhci_abort_cmd_ring is successful.

Solution to above problem is:
1. calling xhci_cleanup_command_queue API even if xhci_abort_cmd_ring
is successful or not.
2. checking the status of reset_device in usb core code.

Before Fix:
root@phoenix:~# echo mem > /sys/power/state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: suspend of devices complete after 103.144 msecs
PM: late suspend of devices complete after 1.503 msecs
PM: noirq suspend of devices complete after 1.220 msecs
Disabling non-boot CPUs ...
CPU1: shutdown
Retrying again to check for CPU kill
CPU1 killed.
Enabling non-boot CPUs ...
CPU1 is up
PM: noirq resume of devices complete after 1.996 msecs
PM: early resume of devices complete after 1.152 msecs
usb usb1: root hub lost power or was reset
usb usb2: root hub lost power or was reset
----- <<hangs indefinitely>> --------------

After Fix:
root@phoenix:~# echo mem > /sys/power/state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: suspend of devices complete after 103.086 msecs
PM: late suspend of devices complete after 1.517 msecs
PM: noirq suspend of devices complete after 1.217 msecs
Disabling non-boot CPUs ...
CPU1: shutdown
Retrying again to check for CPU kill
CPU1 killed.
Enabling non-boot CPUs ...
CPU1 is up
PM: noirq resume of devices complete after 1.991 msecs
PM: early resume of devices complete after 1.239 msecs
usb usb1: root hub lost power or was reset
usb usb2: root hub lost power or was reset
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
PM: resume of devices complete after 75567.769 msecs
Restarting tasks ...
usb 1-1: USB disconnect, device number 2
usb 2-1: USB disconnect, device number 2
usb 2-1.1: USB disconnect, device number 3
done.
root@phoenix:~#

Signed-off-by: Sriram Dash <sriram.dash@xxxxxxx>
Signed-off-by: Rajesh Bhagat <rajesh.bhagat@xxxxxxx>
---
drivers/usb/core/hub.c | 12 ++++++++----
drivers/usb/host/xhci-ring.c | 2 +-
2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 38cc4ba..c906018 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -2897,10 +2897,14 @@ done:
/* The xHC may think the device is already reset,
* so ignore the status.
*/
- if (hcd->driver->reset_device)
- hcd->driver->reset_device(hcd, udev);
-
- usb_set_device_state(udev, USB_STATE_DEFAULT);
+ if (hcd->driver->reset_device) {
+ status = hcd->driver->reset_device(hcd, udev);
+ if (status == 0)
+ usb_set_device_state(udev, USB_STATE_DEFAULT);
+ else
+ usb_set_device_state(udev, USB_STATE_NOTATTACHED);
+ } else
+ usb_set_device_state(udev, USB_STATE_DEFAULT);
}
} else {
if (udev)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 7cf6621..be8fd61 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1272,9 +1272,9 @@ void xhci_handle_command_timeout(unsigned long data)
spin_unlock_irqrestore(&xhci->lock, flags);
xhci_dbg(xhci, "Command timeout\n");
ret = xhci_abort_cmd_ring(xhci);
+ xhci_cleanup_command_queue(xhci);
if (unlikely(ret == -ESHUTDOWN)) {
xhci_err(xhci, "Abort command ring failed\n");
- xhci_cleanup_command_queue(xhci);
usb_hc_died(xhci_to_hcd(xhci)->primary_hcd);
xhci_dbg(xhci, "xHCI host controller is dead.\n");
}
--
1.7.7.4