Re: Asmedia USB 1343 crashes

From: Mathias Nyman
Date: Thu May 04 2017 - 11:01:21 EST

On 03.05.2017 22:20, Thomas Fjellstrom wrote:
On Wednesday, May 3, 2017 1:54:39 PM MDT Alan Stern wrote:
On Tue, 2 May 2017, Thomas Fjellstrom wrote:

I just had a brief lockup: the desktop stopped responding, as did other USB
devices not on the USB3 controller. Two Android devices were in the process of
restarting.

It doesn't seem to matter which Android devices they are.

[294503.849350] ------------[ cut here ]------------
[294503.849362] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x223/0x230
[294503.849365] NETDEV WATCHDOG: enp4s0 (igb): transmit queue 0 timed out
[294503.849367] Modules linked in: sr_mod cdrom ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype xt_conntrack nf_nat nf_conntrack br_netfilter overlay ebtable_filter ebtables ip6table_filter ip6_tables nfsv3 nfs_acl nfs lockd grace iptable_filter bridge stp llc amdgpu mfd_core fuse vfat fat eeepc_wmi asus_wmi rfkill edac_mce_amd edac_core pcspkr sg amdkfd radeon ttm sunrpc k10temp it87 hwmon_vid fam15h_power efivarfs ip_tables ipv6 autofs4 crc32c_intel i2c_piix4
[294503.849407] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc7 #8
[294503.849410] Hardware name: To be filled by O.E.M. To be filled by O.E.M./970 PRO GAMING/AURA, BIOS 0901 11/07/2016
[294503.849413] Call Trace:
[294503.849417] <IRQ>
[294503.849422] dump_stack+0x4d/0x63
[294503.849426] __warn+0xc6/0xe0
[294503.849430] warn_slowpath_fmt+0x46/0x50
[294503.849434] dev_watchdog+0x223/0x230
[294503.849438] ? qdisc_rcu_free+0x40/0x40
[294503.849442] call_timer_fn+0x30/0x160
[294503.849445] ? qdisc_rcu_free+0x40/0x40
[294503.849448] run_timer_softirq+0x1e1/0x440
[294503.849453] ? lapic_next_event+0x18/0x20
[294503.849456] ? sched_clock_cpu+0x11/0xd0
[294503.849459] __do_softirq+0x101/0x2f0
[294503.849463] irq_exit+0xb9/0xc0
[294503.849466] smp_apic_timer_interrupt+0x38/0x50
[294503.849470] apic_timer_interrupt+0x86/0x90
[294503.849474] RIP: 0010:acpi_idle_do_entry+0x2c/0x40
[294503.849476] RSP: 0018:ffffffffb2a03d90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[294503.849480] RAX: 0000000000000000 RBX: ffff884d1a966c00 RCX: 0000000000000034
[294503.849483] RDX: 4ec4ec4ec4ec4ec5 RSI: 0000000000000001 RDI: ffff884d1a966c64
[294503.849485] RBP: ffffffffb2a03dd0 R08: 00000000000003e3 R09: 0000000000000018
[294503.849487] R10: 00000000000003c1 R11: 00000000000003d4 R12: ffff884d1a966c64
[294503.849490] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000001
[294503.849492] </IRQ>
[294503.849497] ? acpi_idle_enter+0xd7/0x290
[294503.849502] cpuidle_enter_state+0xed/0x2e0
[294503.849506] cpuidle_enter+0x12/0x20
[294503.849509] call_cpuidle+0x1e/0x30
[294503.849512] do_idle+0x179/0x1d0
[294503.849515] cpu_startup_entry+0x5d/0x60
[294503.849518] rest_init+0x7f/0x90
[294503.849522] start_kernel+0x405/0x412
[294503.849525] x86_64_start_reservations+0x24/0x26
[294503.849528] x86_64_start_kernel+0x182/0x193
[294503.849531] start_cpu+0x14/0x14
[294503.849534] ? start_cpu+0x14/0x14
[294503.849537] ---[ end trace 12db587e781d6e4f ]---
[294503.849558] igb 0000:04:00.0 enp4s0: Reset adapter
[294504.576629] xhci_hcd 0000:02:00.0: Stop command ring failed, maybe the host is dead
[294504.576656] xhci_hcd 0000:02:00.0: Abort command ring failed
[294504.576799] xhci_hcd 0000:02:00.0: xHCI host not responding to stop endpoint command.
[294504.576805] xhci_hcd 0000:02:00.0: Assuming host is dying, halting host.

At this point you have reached the limit of my knowledge. The best
person to help is Mathias Nyman, the xHCI maintainer (CC'ed).

For some reason stopping the command ring fails. The ring is stopped by writing a
bit in a register, and the hardware is supposed to clear another bit in the same
register once the ring has stopped. We poll for the second bit immediately after
writing the first. If the second bit is not cleared after 5 seconds, we bail out.

It could be that hardware never clears the bit.
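
To make the write-then-poll sequence concrete, here is a minimal user-space sketch. It is not the xhci driver code: the register is simulated, the names fake_hc, fake_read and stop_ring are hypothetical, the bit positions are assumptions chosen for illustration, and a poll count stands in for the 5 second timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define RING_ABORT   (1u << 2)  /* software sets this to request a stop */
#define RING_RUNNING (1u << 3)  /* hardware clears this once the ring stopped */

/*
 * Simulated host controller (hypothetical). After an abort request it
 * clears RING_RUNNING once `polls_until_stop` further reads have happened.
 * A negative value models hardware that never clears the bit.
 */
struct fake_hc {
    uint32_t cmd_ring;
    int polls_until_stop;
};

/* Read the simulated register, letting the "hardware" make progress. */
uint32_t fake_read(struct fake_hc *hc)
{
    if ((hc->cmd_ring & RING_ABORT) && hc->polls_until_stop >= 0) {
        if (hc->polls_until_stop == 0)
            hc->cmd_ring &= ~(RING_ABORT | RING_RUNNING);
        else
            hc->polls_until_stop--;
    }
    return hc->cmd_ring;
}

/*
 * Request a stop, then poll for RING_RUNNING to clear.
 * Returns 0 if the ring stopped within max_polls reads, -1 on timeout.
 */
int stop_ring(struct fake_hc *hc, int max_polls)
{
    assert(hc != NULL);

    hc->cmd_ring |= RING_ABORT;          /* write the first bit */

    for (int i = 0; i < max_polls; i++) {
        if (!(fake_read(hc) & RING_RUNNING))
            return 0;                    /* second bit cleared: stopped */
        /* a real driver would delay briefly between reads */
    }
    return -1;                           /* bit never cleared: bail out */
}
```

With a controller initialised as { RING_RUNNING, 3 }, stop_ring() succeeds on the fourth read. With { RING_RUNNING, -1 } it times out, which corresponds to the "Stop command ring failed" case in the log above.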

You said you had two android phones connected, and both were restarting.
It could be a race in the command ring stopping code.

Can you reproduce this xhci issue with only one Android device connected?