Re: [PATCH] USB: core: Add warm reset while reset-resuming SuperSpeed HUBs
From: Dan Williams
Date: Wed Dec 11 2013 - 17:51:48 EST
On Wed, Dec 11, 2013 at 11:36 AM, Sarah Sharp
<sarah.a.sharp@xxxxxxxxxxxxxxx> wrote:
> On Wed, Dec 11, 2013 at 11:00:13AM -0800, Julius Werner wrote:
>> > I don't know what you mean by "fails". The system goes to sleep and
>> > then later on wakes up, doesn't it?
>> >
>> > Do you mean that the Jetflash device gets disconnected when the system
>> > wakes up? That's _supposed_ to happen under those circumstances.
>> > When hub_activate() sees HUB_RESET_RESUME, all child devices get
>> > disconnected except those where udev->persist_enabled is set.
>>
>> This patch was written in response to the same bug as my "usb: hub:
>> Use correct reset for wedged USB3 devices that are NOTATTACHED"
>> submission. My patch only helps when the port gets stuck in Compliance
>> Mode, but Vikas reports that he can sometimes see it stuck in Polling
>> or Recovery states as well.
>>
>> The underlying issue is a deadlock in the USB 3.0 link training state
>> machine when the host controller is unilaterally reset on resume
>> (without driving a reset on the bus).
>
> The xHCI spec requires that when the xHCI host is reset, a USB reset is
> driven down the USB 3.0 ports. If hot reset fails, the port may migrate
> to warm reset. See table 32 in the xHCI spec, in the definition of
> HCRST. It sounds like this host doesn't drive a USB reset down USB 3.0
> ports at all on host controller reset?
...although, the spec says that it does not wait for the port resets
to complete. As far as I can see re-issuing a warm reset and waiting
is the only way to guarantee the core times the recovery. Presumably
the portstatus debounce in hub_activate() mitigates this, but that
100ms is less than a full reset timeout. I have something similar in
the port power rework patches [1], but I think something like the
following (untested) is more generic, it arranges for reset_resume to
start with a warm reset if necessary.
(also attached)
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index a7c04e24ca48..30ce237569dd 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -2783,8 +2783,14 @@ static int check_port_resume_type(struct
usb_device *udev,
struct usb_hub *hub, int port1,
int status, unsigned portchange, unsigned portstatus)
{
+ /* Did the port go SS.Inactive? Even if ->persist_enabled is
cleared the
+ * device won't come back until a warm reset completes
+ */
+ if (hub_port_warm_reset_required(hub, portstatus)) {
+ udev->reset_resume = 1;
+ udev->reset_resume_warm = 1;
/* Is the device still present? */
- if (status || port_is_suspended(hub, portstatus) ||
+ } else if (status || port_is_suspended(hub, portstatus) ||
!port_is_power_on(hub, portstatus) ||
!(portstatus & USB_PORT_STAT_CONNECTION)) {
if (status >= 0)
@@ -4022,7 +4028,8 @@ hub_port_init (struct usb_hub *hub, struct
usb_device *udev, int port1,
/* Reset the device; full speed may morph to high speed */
/* FIXME a USB 2.0 device may morph into SuperSpeed on reset. */
- retval = hub_port_reset(hub, port1, udev, delay, false);
+ retval = hub_port_reset(hub, port1, udev, delay,
+ udev->reset_resume_warm);
if (retval < 0) /* error or disconnect */
goto fail;
/* success, speed is known */
@@ -4730,7 +4737,8 @@ static void hub_events(void)
/* deal with port status changes */
for (i = 1; i <= hdev->maxchild; i++) {
- if (test_bit(i, hub->busy_bits))
+ if (test_bit(i, hub->busy_bits) ||
+ test_bit(i, hub->delayed_change_bits))
continue;
connect_change = test_bit(i, hub->change_bits);
wakeup_change = test_and_clear_bit(i, hub->wakeup_bits);
diff --git a/include/linux/usb.h b/include/linux/usb.h
index 7454865ad148..ff1b6fe4a0ff 100644
--- a/include/linux/usb.h
+++ b/include/linux/usb.h
@@ -572,6 +572,7 @@ struct usb_device {
unsigned do_remote_wakeup:1;
unsigned reset_resume:1;
+ unsigned reset_resume_warm:1;
unsigned port_is_suspended:1;
#endif
struct wusb_dev *wusb_dev;
>
>> The host port starts out back in
>> Rx.Detect without remembering anything about its previous state, but
>> the device is still in U3. The host detects Rx terminations, moves to
>> Polling and starts sending LFPS link training packets, but the device
>> doesn't expect those and interprets them as link problems (moving to
>> Recovery). What happens next seems to be device specific, but
>> apparently the device can end up in SS.Inactive while the host port
>> gets stuck in Polling or Recovery (or some kind of livelock between
>> those).
In testing the port power patches I see this particular device give up
on its superspeed connection if it sees too many link failures and
fallsback to connecting via its high speed connection.
>> This patch tries to warm reset all USB 3.0 ports on reset-resume
>> (after xhci_reset() was called) that had devices connected to them
>> before suspend. This seems to be the only way to ensure the devices'
>> state machines get back to a well-defined state that the host can work
>> with. I don't think this is a specific hardware bug, it's just an
>> unfortunate design flaw that the USB 3.0 spec doesn't account for a
>> root hub port being reset independently of its connected device. I
>> think Sarah is correct that it could be limited to root hubs, though.
>
I think we also need something like the following to hold off khubd
while further resume actions may be taking place. Again we're
mitigated by the debounce delay in most cases, but if a warm reset
recovery is going to take longer than that khubd needs to be held off
for the given port.
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 30ce237569dd..a99e3eb28aa5 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -1173,22 +1173,25 @@ static void hub_activate(struct usb_hub *hub,
enum hub_activation_type type)
port_resumed))
set_bit(port1, hub->change_bits);
- } else if (udev->persist_enabled) {
+ } else {
struct usb_port *port_dev = hub->ports[port1 - 1];
+ if (udev->persist_enabled) {
#ifdef CONFIG_PM
- udev->reset_resume = 1;
+ udev->reset_resume = 1;
#endif
- /* Don't set the change_bits when the device
- * was powered off.
- */
- if (port_dev->power_is_on)
- set_bit(port1, hub->change_bits);
+ /* Don't set the change_bits when the device
+ * was powered off.
+ */
+ if (port_dev->power_is_on)
+ set_bit(port1, hub->change_bits);
+ }
- } else {
- /* The power session is gone; tell khubd */
- usb_set_device_state(udev, USB_STATE_NOTATTACHED);
- set_bit(port1, hub->change_bits);
+ /* The power session may be gone; wait for port
+ * recovery before letting khubd take action on
+ * this port
+ */
+ set_bit(port1, hub->delayed_change_bits);
}
}
@@ -3285,6 +3288,11 @@ int usb_port_resume(struct usb_device *udev,
pm_message_t msg)
usb_unlocked_enable_lpm(udev);
}
+ if (test_and_clear_bit(port1, hub->delayed_change_bits)) {
+ set_bit(port1, hub->change_bits);
+ kick_khubd(hub);
+ }
+
return status;
}
diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h
index 4e4790dea343..74e87c0380f8 100644
--- a/drivers/usb/core/hub.h
+++ b/drivers/usb/core/hub.h
@@ -45,8 +45,9 @@ struct usb_hub {
unsigned long event_bits[1]; /* status change bitmask */
unsigned long change_bits[1]; /* ports with logical connect
status change */
- unsigned long busy_bits[1]; /* ports being reset or
- resumed */
+ unsigned long busy_bits[1]; /* ports being reset */
+ unsigned long delayed_change_bits[1]; /* ports awaiting
+ recovery */
unsigned long removed_bits[1]; /* ports with a "removed"
device present */
unsigned long wakeup_bits[1]; /* ports that have signaled
Attachment:
warm-reset-resume
Description: Binary data
Attachment:
delay-khubd
Description: Binary data