Re: 2.6.38: XFS/USB/HW issue, or failing USB stick?

From: Justin Piszcz
Date: Fri Mar 18 2011 - 13:47:58 EST




On Fri, 18 Mar 2011, Alan Stern wrote:

On Fri, 18 Mar 2011, Justin Piszcz wrote:

Hi,

I can write to just about the entire USB stick, with no errors:

atom:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 5.8G 1.5G 4.3G 26% /
tmpfs 2.0G 0 2.0G 0% /lib/init/rw
udev 10M 140K 9.9M 2% /dev
tmpfs 2.0G 0 2.0G 0% /dev/shm
atom:~# cd /
atom:/# ls
bin cdrom etc lib media nfs proc sbin srv tmp var
boot dev home lib64 mnt opt root selinux sys usr
atom:/# dd if=/dev/zero of=bigfile bs=1M count=4000
4000+0 records in
4000+0 records out
4194304000 bytes (4.2 GB) copied, 135.536 s, 30.9 MB/s
atom:/# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 5.8G 5.4G 350M 95% /
tmpfs 2.0G 0 2.0G 0% /lib/init/rw
udev 10M 140K 9.9M 2% /dev
tmpfs 2.0G 0 2.0G 0% /dev/shm
atom:/# rm bigfile

However, after some amount of time, the errors below occur. Is this USB
stick failing? Since it has no SMART, is there any other way to verify
the 'health' of a USB stick?

None that I know of.
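
Short of a SMART-style health report, one crude check is to extend the dd
test above: write known data to the stick, read it back, and compare. The
following is a minimal sketch in Python under stated assumptions. The
function name write_and_verify and its parameters are illustrative and not
part of any existing tool. It can only catch read-back mismatches and I/O
errors. It says nothing about wear, and the read may be served from the
page cache unless the cache is dropped or the stick is remounted first.

```python
import hashlib
import os


def write_and_verify(path, size_mib=4, chunk=1024 * 1024):
    """Write random data to `path`, read it back, and compare SHA-256 digests.

    Hypothetical helper for illustration: `path` should point to a file on
    the filesystem under test. Returns True if the digests match.
    """
    written = hashlib.sha256()
    with open(path, "wb") as f:
        for _ in range(size_mib):
            block = os.urandom(chunk)
            written.update(block)
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data to the device

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks until EOF (f.read returns b"").
        for block in iter(lambda: f.read(chunk), b""):
            read_back.update(block)

    return written.hexdigest() == read_back.hexdigest()
```

Usage would be something like write_and_verify("/bigfile", size_mib=4000),
followed by removing the file. An OSError raised during the write or read
would correspond to the kind of I/O error shown in the log below.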

Mar 18 07:55:12 atom [ 10.034310] e1000e 0000:03:00.0: eth1: 10/100 speed: disabling TSO

[ .. no errors .. ]

Mar 18 08:32:44 atom [ 2261.883848] usb 1-1: USB disconnect, address 2
Mar 18 08:32:44 atom [ 2261.884465] Buffer I/O error on device sda2, logical block 1317256

The stick didn't "fail" in any obvious way, but for some reason it was
disconnected from the USB bus. (If it initiated that disconnect by
itself, I guess you could consider that a failure.) Maybe it was
something as simple as overheating causing a loss of electrical contact
between the connector and the pins in the USB port.

It is possible, but the box is kept cool:

w83627dhg-isa-0ca0
Adapter: ISA adapter
Vcore: +1.16 V (min = +0.72 V, max = +1.39 V)
in1: +1.04 V (min = +0.94 V, max = +1.16 V)
AVCC: +3.34 V (min = +2.96 V, max = +3.63 V)
+3.3V: +3.34 V (min = +2.98 V, max = +3.63 V)
in4: +1.84 V (min = +1.62 V, max = +1.98 V)
in5: +1.26 V (min = +1.13 V, max = +1.38 V)
in6: +0.75 V (min = +0.67 V, max = +0.83 V)
3VSB: +3.30 V (min = +2.96 V, max = +3.63 V)
Vbat: +3.07 V (min = +2.96 V, max = +3.63 V)
fan1: 0 RPM (min = 727 RPM, div = 64) ALARM
fan2: 0 RPM (min = 727 RPM, div = 64) ALARM
fan3: 0 RPM (min = 727 RPM, div = 64) ALARM
fan4: 1240 RPM (min = 712 RPM, div = 8)
fan5: 0 RPM (min = 727 RPM, div = 64) ALARM
temp1: +36.0°C (high = +75.0°C, hyst = +70.0°C) sensor = thermistor
temp2: +35.5°C (high = +90.0°C, hyst = +87.0°C) sensor = diode
temp3: +19.0°C (high = +80.0°C, hyst = +75.0°C) sensor = diode
cpu0_vid: +0.000 V

coretemp-isa-0000
Adapter: ISA adapter
Core 0: +12.0°C (crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1: +16.0°C (crit = +100.0°C)
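
To put numbers on "kept cool", the temperature lines above can be compared
against their reported limits. This is a rough sketch in Python; the
function name temp_headroom and the regular expression are assumptions made
for illustration and only handle lines shaped like the ones in this output.

```python
import re

# Matches e.g. "temp1: +36.0°C (high = +75.0°C, ..." or
# "Core 0: +12.0°C (crit = +100.0°C)".
TEMP_RE = re.compile(
    r"^(.+?):\s*\+?(-?\d+(?:\.\d+)?)°C\s+"
    r"\((?:high|crit) = \+?(-?\d+(?:\.\d+)?)°C"
)


def temp_headroom(text):
    """Return {sensor name: limit minus current reading} in degrees C."""
    headroom = {}
    for line in text.splitlines():
        m = TEMP_RE.match(line.strip())
        if m:
            name = m.group(1)
            current = float(m.group(2))
            limit = float(m.group(3))
            headroom[name] = limit - current
    return headroom


# Temperature lines copied from the sensors output above.
sample = """\
temp1: +36.0°C (high = +75.0°C, hyst = +70.0°C) sensor = thermistor
temp2: +35.5°C (high = +90.0°C, hyst = +87.0°C) sensor = diode
temp3: +19.0°C (high = +80.0°C, hyst = +75.0°C) sensor = diode
Core 0: +12.0°C (crit = +100.0°C)
Core 1: +16.0°C (crit = +100.0°C)
"""

for name, margin in temp_headroom(sample).items():
    print(f"{name}: {margin:.1f} C below limit")
```

Every sensor here sits at least 39 degrees below its limit.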


...
Mar 18 08:33:06 atom [ 2283.963059] usb 1-1: new high speed USB device using ehci_hcd and address 4
Mar 18 08:33:06 atom [ 2284.080647] usb 1-1: New USB device found, idVendor=0325, idProduct=ac02
Mar 18 08:33:06 atom [ 2284.080707] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Mar 18 08:33:06 atom [ 2284.080752] usb 1-1: Product: R2_TURBO
Mar 18 08:33:06 atom [ 2284.080794] usb 1-1: Manufacturer: OCZ Technology
Mar 18 08:33:06 atom [ 2284.080831] usb 1-1: SerialNumber: (removed)

And then 22 seconds later it reconnected.

Alan Stern
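
For reference, the 22-second gap can be read off the kernel timestamps in
the disconnect and reconnect lines quoted above. A minimal sketch in
Python follows; the function name reconnect_gaps and the regular expression
are illustrative assumptions, written only against the log format shown in
this thread.

```python
import re

# Captures the kernel timestamp, the USB device path, and the message, e.g.
# "[ 2261.883848] usb 1-1: USB disconnect, address 2".
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+usb\s+(\S+):\s+(.*)")


def reconnect_gaps(lines):
    """Pair each USB disconnect with the next enumeration on the same port.

    Returns a list of (device, disconnect_time, reconnect_time, gap_seconds).
    """
    pending = {}  # device path -> timestamp of the last unmatched disconnect
    gaps = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts, dev, msg = float(m.group(1)), m.group(2), m.group(3)
        if msg.startswith("USB disconnect"):
            pending[dev] = ts
        elif msg.startswith("new ") and "USB device" in msg and dev in pending:
            start = pending.pop(dev)
            gaps.append((dev, start, ts, ts - start))
    return gaps


# The two relevant lines from the log above.
log = [
    "Mar 18 08:32:44 atom [ 2261.883848] usb 1-1: USB disconnect, address 2",
    "Mar 18 08:33:06 atom [ 2283.963059] usb 1-1: new high speed USB device "
    "using ehci_hcd and address 4",
]

for dev, down, up, gap in reconnect_gaps(log):
    print(f"usb {dev}: gone at {down:.3f}, back at {up:.3f}, gap {gap:.1f} s")
```

Run over a full kernel log, the same function would list every
drop-and-return on each port, which may help when comparing sticks and
ports.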


Very strange. Could some USB option cause this? I guess the next step is to
use ext2 and a different stick in the same port to see if I can get it to
recur. Then, if it happens again, I will try a different port.

Justin.