Re: Intel ICH9M/M-E SATA error-handling/reset problems

From: Robert Hancock
Date: Sun Feb 15 2009 - 13:04:59 EST


Serguei Miridonov wrote:
Hello Robert and Jeff,

Thank you for your replies.

On Saturday 14 February 2009, Jeff Garzik wrote:
Serguei Miridonov wrote:
I have some problems with SATA in a new notebook PC (HP Pavilion
dv5t, Intel chipset). A Seagate FreeAgent Pro 1TB external drive
practically cannot be used with eSATA in Linux (fresh install
from DVD Fedora 10, now fully updated), and yesterday I also had a
problem with DVD recording using the internal HL-DT-ST BDDVDRW drive.
Some eSATA fixes went into the more recent kernels... Can you try
2.6.29-rc5?

Unfortunately, right now I cannot provide a good test bed for a new kernel. I also suspected a bad cable and returned it to the store. Recording DVDs, as you understand, cannot be considered for testing: I don't do it on a regular basis... I will be looking for a new eSATA cable in a week or two, so when I have it I'll try to download and build the kernel for these experiments.

On Saturday 14 February 2009, Robert Hancock wrote:
Serguei Miridonov wrote:
Both devices work with Windows Vista. Even in Vista, the Seagate
external drive produces "parity error" messages in the Windows event
log, but the OS somehow recovers from these errors and continues to
use the drive with a slight slowdown (average speed varies between
60 and 110 MB/s). Of course, it could be a cable/Seagate issue, but
again - Vista can handle this.
There are a lot of issues with eSATA drives and cabling. As Jeff
mentioned, there are some changes in 2.6.29-rc that may improve the
behavior, but the root cause here is a hardware issue (you should
not expect very good behavior in Vista either with those errors).

I agree with you completely. Nevertheless, something like 10 errors per 2GB transfer cannot be a reason to give up. Vista, at least, recovers and continues the data transfer. Linux simply cannot return the interface or the connected device to operating mode. Do you think that is normal?

Could be that Linux is being a bit more aggressive on error handling. In your case, it looks like an error occurred, triggering a hard reset of the device, and the controller seemed unable to talk to the device afterwards. If the command had just been retried, maybe it would have worked better. However, doing that in general can cause issues, since you don't know what the state of the link may be.
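
To make the two policies concrete, here is a rough sketch of "retry the command a few times, then escalate to a hard reset". It is only an illustration under assumptions: `run_with_recovery`, `flaky`, and the callbacks are hypothetical names, this is not how libata's error handler is written, and it ignores the real difficulty mentioned above, namely that after an error the state of the link is unknown.

```python
from typing import Callable, List, Tuple


def run_with_recovery(
    issue_command: Callable[[], bool],
    hard_reset: Callable[[], bool],
    max_retries: int = 2,
    max_resets: int = 1,
) -> Tuple[bool, List[str]]:
    """Hypothetical policy: plain retries first, hard reset as escalation.

    issue_command() returns True on success, False on an interface error.
    hard_reset() returns True if the device answers again after the reset.
    Returns (success, log of the actions taken).
    """
    log: List[str] = []
    resets = 0
    while True:
        # One initial attempt plus up to max_retries plain retries.
        for _ in range(1 + max_retries):
            if issue_command():
                log.append("ok")
                return True, log
            log.append("error")
        # Retries exhausted: escalate, unless the reset budget is spent.
        if resets >= max_resets:
            log.append("give-up")
            return False, log
        resets += 1
        if not hard_reset():
            # The failure seen in this report: the device is
            # unreachable after the reset.
            log.append("reset-failed")
            return False, log
        log.append("reset-ok")


def flaky(outcomes: List[bool]) -> Callable[[], bool]:
    """Simulated command: yields the given outcomes, then always succeeds."""
    it = iter(outcomes)
    return lambda: next(it, True)


if __name__ == "__main__":
    # A single transient error is absorbed by a plain retry.
    print(run_with_recovery(flaky([False, True]), lambda: True))
    # With no retries allowed, the same error goes straight to a reset;
    # if the reset fails, the device is lost.
    print(run_with_recovery(flaky([False, True]), lambda: False, max_retries=0))
```

With `max_retries=0` the sketch behaves like the aggressive case: the first error goes straight to a reset, and a failed reset ends the transfer.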


As for the DVD burning issue, it's hard to say for sure. It
looks like a write command was timing out. Could be due to your
drive not working well with that type of media.

Well, it could be, though I did not consider Sony DVD+R bad media. Maybe that is my mistake... Anyway, even if it is true, why can k3b (or whatever backend is used for recording) no longer establish a connection with the drive? The kernel must keep the hardware working even if there were some intermittent interface errors.

It appears that the Linux kernel has problems with
error-handling/reset of SATA hardware. I have found a lot of
reports regarding SATA problems: data transfer failures, CD/DVD
recording, waking up from suspend to RAM, etc. Aren't they all
related? Can Linux SATA chipsets drivers
Not related at all, mostly... though a lot of people seem to think
they are. Often people think problems are related because the
error messages seem similar, and even the same error can be
triggered by numerous different problems, often not the fault of
the kernel.

I'm not talking now about errors triggered by the kernel due to some bugs. What I see in the logs is the kernel's failure to recover from errors, not the kernel causing them. I hope that this is already fixed in newer kernels, though I could not find such information in the changelogs.

I could be wrong, of course, but it seems to me that if the kernel can really reset the interface and return it and the connected devices to operating mode, then most of the issues mentioned above may become less critical, and people could live with them until the root cause is fixed properly.

Maybe resetting the interface will not help in all cases if a device is left in some screwed-up state due to earlier poor error handling... Well, this is another issue which can be device-vendor-dependent... However, regarding the external Seagate drive, Vista does not have any special driver to handle its errors, it just works...

See above.