ATA error with Asus A7V8X board

From: Neil Schemenauer
Date: Thu Jan 14 2010 - 18:20:48 EST


Hi,

I have what seems to be a poorly designed board, judging by the number
of problems that a Google search turns up. However, since there are
presumably still people who have it, and I have some ability to debug
it, I thought I should try to fix the problem rather than throw the
board out.

It is possible that my problem is caused by a bad cable, bad memory,
or a bad drive, but I suspect the board is just temperamental
(specifically, the way interrupts are handled). The drive had a lot
of bad sectors a while ago, but writing over them seems to have
fixed that. The drive had been working without errors on another
board for a while.

I've tried booting with and without the "noapic" command line option
and with the old ide driver and the pata_via driver. Each
configuration seems to have its own problems. Certain configurations
generated errors like "IRQ nobody cared" and "spurious interrupt".
The noapic and ide driver combination results in the following error:

[ 366.463909] spurious 8259A interrupt: IRQ7.
[ 5432.078546] hda: task_no_data_intr: status=0x30 { DeviceFault SeekComplete }
[ 5432.078555] hda: possibly failed opcode: 0xb0
[ 5462.224012] hda: lost interrupt
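
For anyone reading along, the status byte in the ide message can be
decoded by hand. The following is a minimal Python sketch, assuming the
standard ATA status register bit layout; the short bit labels and the
function name decode_status are my own, not taken from the kernel.
Opcode 0xb0 is the ATA SMART command, so the failed request was
probably a SMART query rather than ordinary file I/O.

```python
# ATA status register bits, most significant first.
# The labels are illustrative abbreviations, not kernel identifiers.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete in later ATA revisions)
    (0x02, "IDX"),   # index (obsolete in later ATA revisions)
    (0x01, "ERR"),   # error, details are in the error register
]


def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


print(decode_status(0x30))  # ['DF', 'DSC']
print(decode_status(0x51))  # ['DRDY', 'DSC', 'ERR']
```

The first result matches the "{ DeviceFault SeekComplete }" text that
the ide driver printed for status=0x30.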

The "noapic" and pata_via combination is what I'm running now. It
works for a while and then I get an error like the following:

ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata4.00: BMDMA stat 0x65
ata4.00: failed command: READ DMA
ata4.00: cmd c8/00:18:4f:27:1b/00:00:00:00:00/e0 tag 0 dma 12288 in
res 51/10:18:4f:27:1b/00:00:00:00:00/a0 Emask 0x81 (invalid argument)
ata4.00: status: { DRDY ERR }
ata4.00: error: { IDNF }
ata4.00: configured for UDMA/100
ata4.01: configured for UDMA/100
sd 3:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
sd 3:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
00 1b 27 4f
sd 3:0:0:0: [sda] ASC=0x14 ASCQ=0x0
sd 3:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 00 1b 27 4f 00 00 18 00
end_request: I/O error, dev sda, sector 1779535
ata4: EH complete
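
As a cross-check, the failing sector can be recovered from both the
libata "cmd" line and the SCSI CDB in the log above. This is a minimal
Python sketch, assuming the usual libata register dump order
(command/feature:nsect:lbal:lbam:lbah/five high-order bytes/device) and
the READ(10) CDB layout; the function names are hypothetical.

```python
def lba28_from_taskfile(dump):
    """Decode an LBA28 libata 'cmd' register dump.

    Returns (command opcode, starting LBA, sector count).
    """
    command, low, _hob, device = dump.split("/")
    _feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    dev = int(device, 16)
    # LBA28: bits 27..24 live in the low nibble of the device register.
    lba = ((dev & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    count = nsect or 256  # a sector count of 0 means 256 in LBA28
    return int(command, 16), lba, count


def read10_from_cdb(cdb_hex):
    """Decode the LBA and block count from a READ(10) CDB given as hex."""
    cdb = bytes.fromhex(cdb_hex)
    if cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


opcode, lba, count = lba28_from_taskfile("c8/00:18:4f:27:1b/00:00:00:00:00/e0")
print(hex(opcode), lba, count, count * 512)  # 0xc8 1779535 24 12288
print(read10_from_cdb("28 00 00 1b 27 4f 00 00 18 00"))  # (1779535, 24)
```

Both decodings give LBA 1779535, the sector named in the end_request
line, and 24 sectors of 512 bytes is the 12288-byte transfer shown on
the "cmd" line.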

I'm attaching the kernel logs for the pata_via and ide setups, along
with the output from lspci and smartctl.

Best regards,

Neil

00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
Subsystem: ASUSTeK Computer Inc. A7V8X motherboard
Flags: bus master, 66MHz, medium devsel, latency 0
Memory at f8000000 (32-bit, prefetchable) [size=64M]
Capabilities: [a0] AGP version 2.0
Capabilities: [c0] Power Management version 2

00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, medium devsel, latency 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000d000-0000dfff
Memory behind bridge: e7800000-e7ffffff
Prefetchable memory behind bridge: eff00000-f7ffffff
Capabilities: [80] Power Management version 2

00:07.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46) (prog-if 10 [OHCI])
Subsystem: VIA Technologies, Inc. IEEE 1394 Host Controller
Flags: bus master, stepping, medium devsel, latency 32, IRQ 10
Memory at e7000000 (32-bit, non-prefetchable) [size=2K]
I/O ports at b800 [size=128]
Capabilities: [50] Power Management version 2

00:08.0 RAID bus controller: Promise Technology, Inc. PDC20376 (FastTrak 376) (rev 02)
Subsystem: ASUSTeK Computer Inc. A7V8X motherboard
Flags: bus master, 66MHz, medium devsel, latency 96, IRQ 10
I/O ports at b400 [size=64]
I/O ports at b000 [size=16]
I/O ports at a800 [size=128]
Memory at e6800000 (32-bit, non-prefetchable) [size=4K]
Memory at e6000000 (32-bit, non-prefetchable) [size=128K]
Capabilities: [60] Power Management version 2

00:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 01)
Subsystem: ASUSTeK Computer Inc. A7V8X motherboard
Flags: bus master, fast devsel, latency 32, IRQ 12
Memory at e5800000 (32-bit, non-prefetchable) [size=8K]
Expansion ROM at efef0000 [disabled] [size=16K]
Capabilities: [40] Power Management version 2

00:0b.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 05)
Subsystem: Intel Corporation Unknown device 0009
Flags: bus master, medium devsel, latency 32, IRQ 4
Memory at ef000000 (32-bit, prefetchable) [size=4K]
I/O ports at a400 [size=32]
Memory at e5000000 (32-bit, non-prefetchable) [size=1M]
[virtual] Expansion ROM at 40000000 [disabled] [size=1M]
Capabilities: [dc] Power Management version 1

00:0e.0 Multimedia video controller: Internext Compression Inc iTVC16 (CX23416) MPEG-2 Encoder (rev 01)
Subsystem: Hauppauge computer works Inc. WinTV PVR 250
Flags: bus master, medium devsel, latency 64, IRQ 10
Memory at e8000000 (32-bit, prefetchable) [size=64M]
Capabilities: [44] Power Management version 2

00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80) (prog-if 00 [UHCI])
Subsystem: ASUSTeK Computer Inc. VT6202 USB2.0 4 port controller
Flags: bus master, medium devsel, latency 32, IRQ 3
I/O ports at a000 [size=32]
Capabilities: [80] Power Management version 2

00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80) (prog-if 00 [UHCI])
Subsystem: ASUSTeK Computer Inc. VT6202 USB2.0 4 port controller
Flags: bus master, medium devsel, latency 32, IRQ 3
I/O ports at 9800 [size=32]
Capabilities: [80] Power Management version 2

00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 80) (prog-if 00 [UHCI])
Subsystem: ASUSTeK Computer Inc. VT6202 USB2.0 4 port controller
Flags: bus master, medium devsel, latency 32, IRQ 3
I/O ports at 9400 [size=32]
Capabilities: [80] Power Management version 2

00:10.3 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 82) (prog-if 20 [EHCI])
Subsystem: ASUSTeK Computer Inc. A7V8X motherboard
Flags: bus master, medium devsel, latency 32, IRQ 3
Memory at e4800000 (32-bit, non-prefetchable) [size=256]
Capabilities: [80] Power Management version 2

00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
Subsystem: ASUSTeK Computer Inc. A7V8X motherboard
Flags: bus master, stepping, medium devsel, latency 0
Capabilities: [c0] Power Management version 2

00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP])
Subsystem: ASUSTeK Computer Inc. A7V8X / A7V333 motherboard
Flags: bus master, medium devsel, latency 32, IRQ 255
[virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]
[virtual] Memory at 000003f0 (type 3, non-prefetchable) [size=1]
[virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]
[virtual] Memory at 00000370 (type 3, non-prefetchable) [size=1]
I/O ports at 9000 [size=16]
Capabilities: [c0] Power Management version 2

00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 50)
Subsystem: ASUSTeK Computer Inc. A7V8X Motherboard (Realtek ALC650 codec)
Flags: medium devsel, IRQ 7
I/O ports at e000 [size=256]
Capabilities: [c0] Power Management version 2

01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV200 QW [Radeon 7500] (prog-if 00 [VGA])
Subsystem: ATI Technologies Inc Radeon 7500
Flags: bus master, stepping, 66MHz, medium devsel, latency 64, IRQ 11
Memory at f0000000 (32-bit, prefetchable) [size=128M]
I/O ports at d800 [size=256]
Memory at e7800000 (32-bit, non-prefetchable) [size=64K]
Expansion ROM at effe0000 [disabled] [size=128K]
Capabilities: [58] AGP version 2.0
Capabilities: [50] Power Management version 2
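
Since interrupt handling is the suspected culprit, it may help to see
which devices share an IRQ line. In the listing above, IRQ 10 is used
by the FireWire controller, the Promise controller, and the PVR 250,
IRQ 3 by all four USB functions, and IRQ 7 by the audio controller.
The following is a minimal Python sketch that groups devices by IRQ,
assuming "lspci -v" style text where each device starts with a
bus:device.function address and the IRQ appears on the "Flags:" line;
the function name and the shortened sample text are my own.

```python
import re
from collections import defaultdict

# A device header starts with an address such as "00:07.0".
DEVICE_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ")
IRQ_RE = re.compile(r"\bIRQ (\d+)")


def devices_by_irq(lspci_text):
    """Map each IRQ number to the list of PCI addresses that report it."""
    by_irq = defaultdict(list)
    current = None
    for raw in lspci_text.splitlines():
        line = raw.strip()
        header = DEVICE_RE.match(line)
        if header:
            current = header.group(1)
            continue
        irq = IRQ_RE.search(line)
        if current and line.startswith("Flags:") and irq:
            by_irq[int(irq.group(1))].append(current)
    return dict(by_irq)


# Shortened sample taken from the listing above.
SAMPLE = """\
00:07.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46)
Flags: bus master, stepping, medium devsel, latency 32, IRQ 10
00:08.0 RAID bus controller: Promise Technology, Inc. PDC20376 (FastTrak 376) (rev 02)
Flags: bus master, 66MHz, medium devsel, latency 96, IRQ 10
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 50)
Flags: medium devsel, IRQ 7
"""

for irq, devices in sorted(devices_by_irq(SAMPLE).items()):
    note = "  (shared)" if len(devices) > 1 else ""
    print(f"IRQ {irq}: {', '.join(devices)}{note}")
```

Devices without an IRQ on their Flags line, such as the host bridge,
are simply left out of the result.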

smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.7 and 7200.7 Plus family
Device Model: ST3200822A
Serial Number: 3LJ1SWF5
Firmware Version: 3.01
User Capacity: 200,049,647,616 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 2
Local Time is: Thu Jan 14 17:08:55 2010 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 37) The self-test routine was interrupted
by the host with a hard or soft reset.
Total time to complete Offline
data collection: ( 430) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 111) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 056 045 006 Pre-fail Always - 169686569
3 Spin_Up_Time 0x0003 097 096 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 267
5 Reallocated_Sector_Ct 0x0033 099 099 036 Pre-fail Always - 42
7 Seek_Error_Rate 0x000f 082 061 030 Pre-fail Always - 193571553
9 Power_On_Hours 0x0032 083 083 000 Old_age Always - 15179
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 203
194 Temperature_Celsius 0x0022 038 046 000 Old_age Always - 38
195 Hardware_ECC_Recovered 0x001a 056 044 000 Old_age Always - 169686569
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 189 000 Old_age Always - 11
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0
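
To read the table above: smartctl considers an attribute failed when
its normalized VALUE is less than or equal to THRESH. The following is
a minimal Python sketch that computes that margin per row, assuming the
ten-column layout shown above with a purely numeric RAW_VALUE; the
function names are hypothetical. None of the attributes here is at its
threshold, though the nonzero raw UDMA_CRC_Error_Count (11) is the kind
of counter often associated with cabling or interface trouble.

```python
def parse_attribute(row):
    """Parse one row of the smartctl attribute table into a dict."""
    f = row.split()
    return {
        "id": int(f[0]),
        "name": f[1],
        "value": int(f[3]),   # normalized current value
        "worst": int(f[4]),   # lowest normalized value recorded
        "thresh": int(f[5]),  # failure threshold
        "raw": int(f[9]),     # vendor-specific raw counter
    }


def margin(attr):
    """Distance between the normalized value and its failure threshold."""
    return attr["value"] - attr["thresh"]


ROWS = [
    "1 Raw_Read_Error_Rate 0x000f 056 045 006 Pre-fail Always - 169686569",
    "5 Reallocated_Sector_Ct 0x0033 099 099 036 Pre-fail Always - 42",
    "199 UDMA_CRC_Error_Count 0x003e 200 189 000 Old_age Always - 11",
]

for row in ROWS:
    attr = parse_attribute(row)
    state = "FAILED" if margin(attr) <= 0 else "ok"
    print(f"{attr['name']:<24} margin={margin(attr):>3} raw={attr['raw']} {state}")
```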

SMART Error Log Version: 1
ATA Error Count: 12042 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 12042 occurred at disk power-on lifetime: 14108 hours (587 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 4f 04 2b e0 Error: UNC 8 sectors at LBA = 0x002b044f = 2819151

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 4f 04 2b e0 00 08:10:32.287 READ DMA
27 00 00 00 00 00 e0 00 08:10:28.361 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 02 08:10:28.353 IDENTIFY DEVICE
ef 03 44 00 00 00 a0 02 08:10:28.334 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 08:10:28.331 READ NATIVE MAX ADDRESS EXT

Error 12041 occurred at disk power-on lifetime: 14108 hours (587 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 4f 04 2b e0 Error: UNC 8 sectors at LBA = 0x002b044f = 2819151

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 4f 04 2b e0 00 08:10:20.345 READ DMA
27 00 00 00 00 00 e0 00 08:10:28.361 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 02 08:10:28.353 IDENTIFY DEVICE
ef 03 44 00 00 00 a0 02 08:10:28.334 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 08:10:28.331 READ NATIVE MAX ADDRESS EXT

Error 12040 occurred at disk power-on lifetime: 14108 hours (587 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 4f 04 2b e0 Error: UNC 8 sectors at LBA = 0x002b044f = 2819151

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 4f 04 2b e0 00 08:10:20.345 READ DMA
27 00 00 00 00 00 e0 00 08:10:20.343 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 02 08:10:20.342 IDENTIFY DEVICE
ef 03 44 00 00 00 a0 02 08:10:20.323 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 08:10:16.443 READ NATIVE MAX ADDRESS EXT

Error 12039 occurred at disk power-on lifetime: 14108 hours (587 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 4f 04 2b e0 Error: UNC 8 sectors at LBA = 0x002b044f = 2819151

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 4f 04 2b e0 00 08:10:20.345 READ DMA
27 00 00 00 00 00 e0 00 08:10:20.343 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 02 08:10:20.342 IDENTIFY DEVICE
ef 03 44 00 00 00 a0 02 08:10:20.323 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 08:10:16.443 READ NATIVE MAX ADDRESS EXT

Error 12038 occurred at disk power-on lifetime: 14108 hours (587 days + 20 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 4f 04 2b e0 Error: UNC 8 sectors at LBA = 0x002b044f = 2819151

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 4f 04 2b e0 00 08:10:12.412 READ DMA
27 00 00 00 00 00 e0 00 08:10:12.408 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 02 08:10:12.400 IDENTIFY DEVICE
ef 03 44 00 00 00 a0 02 08:10:12.381 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 e0 00 08:10:16.443 READ NATIVE MAX ADDRESS EXT
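
The register dumps in the error log entries above can be decoded the
same way smartctl does it. This is a minimal Python sketch, assuming
LBA28 addressing (LBA bits 27..24 in the low nibble of DH) and the ATA
error register bit layout; the bit labels and function name are my own,
and the two lowest bits use names from older ATA revisions.

```python
# ATA error register bits, most significant first (labels illustrative).
ATA_ERROR_BITS = [
    (0x80, "ICRC"),  # interface CRC error
    (0x40, "UNC"),   # uncorrectable data error
    (0x20, "MC"),    # media changed
    (0x10, "IDNF"),  # ID (sector address) not found
    (0x08, "MCR"),   # media change request
    (0x04, "ABRT"),  # command aborted
    (0x02, "NM"),    # no media (track 0 not found in older revisions)
    (0x01, "AMNF"),  # address mark not found (obsolete)
]


def decode_error_entry(regs):
    """Decode an 'ER ST SC SN CL CH DH' dump from the SMART error log.

    Returns (error flag names, sector count, LBA).
    """
    er, _st, sc, sn, cl, ch, dh = (int(b, 16) for b in regs.split())
    flags = [name for mask, name in ATA_ERROR_BITS if er & mask]
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    return flags, sc, lba


print(decode_error_entry("40 51 08 4f 04 2b e0"))  # (['UNC'], 8, 2819151)
```

This reproduces the "UNC 8 sectors at LBA = 0x002b044f = 2819151"
summary. The same table maps the error byte 0x10 from the kernel's
"res 51/10" line to IDNF.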

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short captive Interrupted (host reset) 50% 14108 -
# 2 Short offline Completed without error 00% 14108 -
# 3 Short offline Completed: read failure 90% 14108 2819151
# 4 Extended offline Interrupted (host reset) 60% 14027 -
# 5 Short offline Completed without error 00% 14026 -
# 6 Extended captive Interrupted (host reset) 90% 14026 -
# 7 Extended captive Interrupted (host reset) 90% 14026 -
# 8 Short captive Interrupted (host reset) 50% 14025 -
# 9 Short captive Interrupted (host reset) 50% 14025 -
#10 Short captive Interrupted (host reset) 50% 14025 -
#11 Extended captive Completed: read failure 90% 14025 1907693
#12 Extended captive Completed: read failure 90% 14025 1906646
#13 Extended captive Completed: read failure 90% 14025 1905599
#14 Short offline Completed without error 00% 13374 -
#15 Short offline Completed: read failure 90% 13374 1905423
#16 Short offline Completed without error 00% 9986 -
#17 Short offline Completed without error 00% 4741 -
#18 Short offline Completed without error 00% 291 -
#19 Short offline Completed without error 00% 160 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Attachment: kernlog-ide.txt.gz
Description: Binary data

Attachment: kernlog-pata.txt.gz
Description: Binary data