Re: [REGRESSION] asix: Lots of asix_rx_fixup() errors and slow transmissions
From: Dean Jenkins
Date: Fri May 06 2016 - 11:00:34 EST
On 05/05/16 13:19, Guodong Xu wrote:
Hi, Dean
I am not sure why do you insist 'not full speed'. Actually, the tests
I run on ARM-64bit is at USB full speed mode. I pasted my log here:
http://paste.ubuntu.com/16236442/
, which includes the information you requested above, ifconfig, dmesg.
The interval between two consecutive errors varies from 10 to 40ms.
Your log from http://paste.ubuntu.com/16236442/ shows high speed for
device 3 is not being used:
[ 3.586968] usb 1-1: new full-speed USB device number 2 using dwc2
[ 3.792091] usb 1-1: not running at top speed; connect to a high
speed hub
[ 3.800477] hub 1-1:1.0: USB hub found
[ 3.803658] hub 1-1:1.0: 3 ports detected
[ 4.086636] usb 1-1.2: new full-speed USB device number 3 using dwc2
[ 4.202209] usb 1-1.2: not running at top speed; connect to a high
speed hub
[ 8.851236] asix 1-1.2:1.0 eth0: register 'asix' at
usb-f72c0000.usb-1.2, ASIX AX88772B USB 2.0 Ethernet, 00:0e:c6:fa:bf:fd
Hopefully, you know USB 2.0 high speed (480Mbps) is faster than full
speed (12Mbps) mode.
Therefore, your USB to Ethernet Adaptor is not running in its optimal
"normal" high speed operation and there is a USB hub in the way that is
not running at USB high speed mode. This is an abnormal configuration
and potentially explains some of your failure observations.
Running at full-speed (12Mbps) mode would explain why the timestamps has
gaps of ms rather than us gaps (for 480Mbps).
In other words, the full-speed hub is restricting the USB to Ethernet
Adaptor to a 12Mbps (half-duplex) bandwidth to support Ethernet 100Mbps
(full-duplex) traffic. That is not going to work very well because
Ethernet frames (perhaps partial Ethernet frames) need to be discarded
within the USB link.
Your ifconfig output from http://paste.ubuntu.com/16236442/ shows 249 errors
eth0 Link encap:Ethernet HWaddr 00:0e:c6:fa:bf:fd
inet addr:192.168.1.11 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:865 errors:249 dropped:0 overruns:0 frame:0
TX packets:880 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1228273 (1.1 MiB) TX bytes:68955 (67.3 KiB)
Before the test
RX packets:28 errors:0 dropped:0 overruns:0 frame:0
After the test
RX packets:865 errors:249 dropped:0 overruns:0 frame:0
Good test packets = 865 - 28 = 837
Detected bad Ethernet frames = 249
Bad to good ratio is 249:837 = 1:3.36 so 1 detected bad Ethernet frame
per 3.36 good Ethernet frames
Your ifconfig output from http://paste.ubuntu.com/16236764/ shows 1282
errors
eth0 Link encap:Ethernet HWaddr 00:0e:c6:fa:bf:fd
inet addr:192.168.1.11 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:55 errors:1282 dropped:0 overruns:0 frame:0
TX packets:64 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14287 (13.9 KiB) TX bytes:7639 (7.4 KiB)
Before the test
RX packets:19 errors:0 dropped:0 overruns:0 frame:0
After the test
RX packets:55 errors:1282 dropped:0 overruns:0 frame:0
Good test packets = 55 - 19 = 36
Detected bad Ethernet frames = 1282
Bad to good ratio is 1282:36 = 1:0.28 so 1 detected bad Ethernet frame
per 0.028 good Ethernet frames
This suggests a very high error rate.
It is interesting that the reported "remaining" value is 988. Is 988 always
shown ? I mean that do you see any other "remaining" values for the "Data
Header synchronisation was lost" error message ?
Yes and No. When doing iperf test in TCP mode, always 988. I have
never seen other "remaining" value.
But,
1. I tried "ping -f -s 1400 [my.arm.64bit.board.ip]", but this cannot
trigger the error.
2. Tried iperf in UDP mode, I saw "Data Header synchronisation was
lost" remaining value is 984 (again, seemingly always in several
tries). Log is pasted here. http://paste.ubuntu.com/16236764/
In http://paste.ubuntu.com/16236764/ you see very many
[ 41.938370] asix 1-1.2:1.0 eth0: asix_rx_fixup() Bad Header Length
0x11400040, offset 4
but only a few
[ 42.214607] asix 1-1.2:1.0 eth0: asix_rx_fixup() Data Header
synchronisation was lost, remaining 984
This suggests that the "Bad Header Length" and "Data Header
synchronisation was lost" error messages are not related to consecutive
URBs. The expectation is that a "Data Header synchronisation was lost"
error message is immediately followed by a "Bad Header Length" message
with a timestamp much less than 1ms (for high speed USB). This is
because an Ethernet frame that spans URBs needs low latency so should be
sent quickly in consecutive URBs.
The Bad Header Length error messages with offset 4 indicates that 32-bit
header word was not found in the expected location at the start of the
URB buffer.
[ 41.938370] asix 1-1.2:1.0 eth0: asix_rx_fixup() Bad Header Length
0x11400040, offset 4
This evidence is suggesting either the data stream is garbled such as by
many dropped URBs or lost partial Ethernet frames, or the ASIX protocol
for the AX88772B chipset differs from the AX88772 chipset so the ASIX
driver is looking in the wrong place for the 32-bit header word. I
suspect data is lost due to the restricted 12Mbps bandwidth.
My conclusion is that your USB to Ethernet Adaptor is not running at
high speed (480Mbps) mode which is causing a partial loss (corruption)
of Ethernet frames across the USB link. A USB Protocol Analyser or
software tool usbmon could be used to confirm this scenario.
Therefore please retest with a working high-speed USB hub or remove the
full-speed USB hub from the test environment and directly connect the
USB to Ethernet Adaptor to the root hub of the USB port. Then repeat the
tests to see whether anything improved.
In other words, you need to eliminate the dmesg messages saying "not
running at top speed; connect to a high speed hub".
Best regards,
Dean
-Guodong Xu
--
Dean Jenkins
Embedded Software Engineer
Linux Transportation Solutions
Mentor Embedded Software Division
Mentor Graphics (UK) Ltd.