Re: SATA problems

From: Pablo Sebastian Greco
Date: Mon Jan 08 2007 - 07:22:30 EST


Tejun Heo wrote:
Pablo Sebastian Greco wrote:
After an uptime of 13:34 under heavy load and no errors, I'm pretty
sure your patch is correct. Is there a way to backport this to 2.6.18.x?

I forgot this (even though I implemented it) but you can turn off NCQ by
doing the following.

# echo 1 > /sys/block/sdX/device/queue_depth

Can you put the seagate drive under load to verify that it's the samsung
drive's problem not the controller's?

Just an off topic question, does anyone know why I get so uneven IRQ
handling on 2.6.19-20 and almost perfect on 2.6.20-rc2-mm1?

I dunno. You have much better chance of getting a useful answer by
asking it on a separate thread with proper subject line. People usualyl
screen threads by subject. There are just too many message in LKML for
anyone to follow all the message.

Thanks.

Guess I spoke too soon :(
Today I found this
Jan 8 04:01:40 squid kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan 8 04:01:40 squid kernel: ata2.00: cmd 25/00:08:49:ee:e8/00:00:16:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jan 8 04:01:40 squid kernel: res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan 8 04:01:40 squid kernel: ata2: soft resetting port
Jan 8 04:01:40 squid kernel: ata2: softreset failed (port busy but CLO unavailable)
Jan 8 04:01:40 squid kernel: ata2: softreset failed, retrying in 5 secs
Jan 8 04:01:45 squid kernel: ata2: hard resetting port
Jan 8 04:01:53 squid kernel: ata2: port is slow to respond, please be patient (Status 0x80)
Jan 8 04:02:16 squid kernel: ata2: port failed to respond (30 secs, Status 0x80)
Jan 8 04:02:16 squid kernel: ata2: COMRESET failed (device not ready)
Jan 8 04:02:16 squid kernel: ata2: hardreset failed, retrying in 5 secs
Jan 8 04:02:21 squid kernel: ata2: hard resetting port
Jan 8 04:02:21 squid kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 8 04:02:21 squid kernel: ata2.00: configured for UDMA/133
Jan 8 04:02:21 squid kernel: ata2: EH complete
Jan 8 04:02:21 squid kernel: SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
Jan 8 04:02:21 squid kernel: sdb: Write Protect is off
Jan 8 04:02:21 squid kernel: SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
#uptime
10:10:12 up 3 days, 22:48, 1 user, load average: 0.22, 0.19, 0.18
4 am is the lowest load ever, so I don't get it.
I've found two differences with older errors
SAct is now 0x0 when before was 0x7fffffff
And the cmd/res used to be really long, now it's just one command
About heavy loading the seagate, I've tested as suggested on other thread dd if=<drive> of=/dev/null
for all 4 drives simultaneously, on top of usual load, and all was perfect with current kernel (2.6.20-rc3 + blacklist).
Don't know what to do to help

Thanks.
Pablo.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/