Re: [PATCH v3] usb: Add a new quirk to let buggy hub enable and disable LPM during suspend and resume

From: Mathias Nyman
Date: Mon Oct 21 2019 - 09:57:20 EST


On 18.10.2019 21.59, Greg Kroah-Hartman wrote:
On Thu, Oct 17, 2019 at 02:33:00PM +0800, Kai-Heng Feng wrote:


On Oct 4, 2019, at 03:04, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:

On Fri, 4 Oct 2019, Kai-Heng Feng wrote:

Dell WD15 dock has a topology like this:
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 10000M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/7p, 5000M
|__ Port 2: Dev 3, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M

Their IDs:
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 004 Device 002: ID 0424:5537 Standard Microsystems Corp.
Bus 004 Device 004: ID 0bda:8153 Realtek Semiconductor Corp.

Ethernet cannot be detected after plugging ethernet cable to the dock,
the hub and roothub get runtime resumed and runtime suspended
immediately:
...
[ 433.315169] xhci_hcd 0000:3a:00.0: hcd_pci_runtime_resume: 0
[ 433.315204] usb usb4: usb auto-resume
[ 433.315226] hub 4-0:1.0: hub_resume
[ 433.315239] xhci_hcd 0000:3a:00.0: Get port status 4-1 read: 0x10202e2, return 0x10343
[ 433.315264] usb usb4-port1: status 0343 change 0001
[ 433.315279] xhci_hcd 0000:3a:00.0: clear port1 connect change, portsc: 0x10002e2
[ 433.315293] xhci_hcd 0000:3a:00.0: Get port status 4-2 read: 0x2a0, return 0x2a0
[ 433.317012] xhci_hcd 0000:3a:00.0: xhci_hub_status_data: stopping port polling.
[ 433.422282] xhci_hcd 0000:3a:00.0: Get port status 4-1 read: 0x10002e2, return 0x343

At this point the SMSC hub (usb 4-1) enters into compliance mode
(USB_SS_PORT_LS_COMP_MOD), and USB core tries to warm-reset it,

[ 433.422307] usb usb4-port1: do warm reset
[ 433.422311] usb 4-1: device reset not allowed in state 8
[ 433.422339] hub 4-0:1.0: state 7 ports 2 chg 0002 evt 0000
[ 433.422346] xhci_hcd 0000:3a:00.0: Get port status 4-1 read: 0x10002e2, return 0x343
[ 433.422356] usb usb4-port1: do warm reset
[ 433.422358] usb 4-1: device reset not allowed in state 8
[ 433.422428] xhci_hcd 0000:3a:00.0: set port remote wake mask, actual port 0 status = 0xf0002e2
[ 433.422455] xhci_hcd 0000:3a:00.0: set port remote wake mask, actual port 1 status = 0xe0002a0
[ 433.422465] hub 4-0:1.0: hub_suspend
[ 433.422475] usb usb4: bus auto-suspend, wakeup 1
[ 433.426161] xhci_hcd 0000:3a:00.0: xhci_hub_status_data: stopping port polling.
[ 433.466209] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.510204] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.554051] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.598235] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.642154] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.686204] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.730205] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.774203] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.818207] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.862040] xhci_hcd 0000:3a:00.0: port 0 polling in bus suspend, waiting
[ 433.862053] xhci_hcd 0000:3a:00.0: xhci_hub_status_data: stopping port polling.
[ 433.862077] xhci_hcd 0000:3a:00.0: xhci_suspend: stopping port polling.
[ 433.862096] xhci_hcd 0000:3a:00.0: // Setting command ring address to 0x8578fc001
[ 433.862312] xhci_hcd 0000:3a:00.0: hcd_pci_runtime_suspend: 0
[ 433.862445] xhci_hcd 0000:3a:00.0: PME# enabled
[ 433.902376] xhci_hcd 0000:3a:00.0: restoring config space at offset 0xc (was 0x0, writing 0x20)
[ 433.902395] xhci_hcd 0000:3a:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100403)
[ 433.902490] xhci_hcd 0000:3a:00.0: PME# disabled
[ 433.902504] xhci_hcd 0000:3a:00.0: enabling bus mastering
[ 433.902547] xhci_hcd 0000:3a:00.0: // Setting command ring address to 0x8578fc001
[ 433.902649] pcieport 0000:00:1b.0: PME: Spurious native interrupt!
[ 433.902839] xhci_hcd 0000:3a:00.0: Port change event, 4-1, id 3, portsc: 0xb0202e2
[ 433.902842] xhci_hcd 0000:3a:00.0: resume root hub
[ 433.902845] xhci_hcd 0000:3a:00.0: handle_port_status: starting port polling.
[ 433.902877] xhci_hcd 0000:3a:00.0: xhci_resume: starting port polling.
[ 433.902889] xhci_hcd 0000:3a:00.0: xhci_hub_status_data: stopping port polling.
[ 433.902891] xhci_hcd 0000:3a:00.0: hcd_pci_runtime_resume: 0
[ 433.902919] usb usb4: usb wakeup-resume
[ 433.902942] usb usb4: usb auto-resume
[ 433.902966] hub 4-0:1.0: hub_resume
...

However the warm-reset never success, the asserted PCI PME keeps the
runtime-resume, warm-reset and runtime-suspend loop which never bring it back
and causing spurious interrupts floods.

After some trial and errors, the issue goes away if LPM on the SMSC hub
is disabled. Digging further, enabling and disabling LPM during runtime
resume and runtime suspend respectively can solve the issue.

So bring back the old LPM behavior as a quirk and use it for the SMSC
hub to solve the issue.

Fixes: d590c2311150 ("usb: Avoid unnecessary LPM enabling and disabling during suspend and resume")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx>
---
v3:
- Add forgotten patch revision changelog.

v2:
- Explained by Alan, the hub should properly handle U3 -> U0 transition.
So use a quirk to target this buggy device only.

Documentation/admin-guide/kernel-parameters.txt | 3 +++
drivers/usb/core/hub.c | 15 +++++++++++++++
drivers/usb/core/quirks.c | 6 ++++++
include/linux/usb/quirks.h | 3 +++
4 files changed, 27 insertions(+)

Mathias may want to try something different to fix this problem. But
if he doesn't, this patch is okay with me.

Acked-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>

If there's no objection, can we merge this patch?

I wanted to have Mathias weigh in on this before merging it...


This might need some closer inspection still.

The "Get port status 4-1 read: 0x10202e" means port is not really in compliance mode,
instead port has CAS (Cold Attach Status) bit set, meaning parts of xHC needed for
link training were probably still powered off when device was plugged in, so device failed
to reach a connected, enabled, U0: link state. I needs to be warm reset.

there is no CAS link state in USB3 spec, so xhci driver reports a compliance mode link state
to usb core instead. Both states are resolved by a warm reset.

But looks like warm reset is refused as usb device state is still "suspended" in software:
"usb 4-1: device reset not allowed in state 8"

-Mathias