Re: Some hints needed how to handle SATA ALPM failures

From: Tejun Heo
Date: Fri Feb 18 2011 - 09:51:10 EST


Hello,

On Fri, Feb 18, 2011 at 01:58:09PM +0100, Stefan Bader wrote:
> We are hoping that those working more closely with the SATA code might
> be aware of this issue. As the symptoms are so severe (data corruption)
> we have ALPM disabled globally, but this does make it hard to get more
> targeted information on affected platforms.

What do you mean by data corruption? File system ro remount or actual
fs corruption? If actual fs corruption is happening, it's highly
likely that there's an underlying issue with the hardware. If data
corruption can be reproduced, can you please run smartctl -a before
and after such a failure and post the outputs?
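
For comparing the two reports, something along these lines might help.
It is only a rough sketch: the helper names capture_smart and
changed_lines are made up for illustration, and it assumes smartmontools
is installed and the script runs as root. It saves nothing to disk and
just lists the lines that differ between the two smartctl -a outputs.

```python
import difflib
import subprocess


def capture_smart(device):
    """Return the text printed by `smartctl -a <device>`.

    Hypothetical helper: assumes smartctl is in PATH and we run as root.
    smartctl reports its exit status as a bitmask, so a non-zero return
    code is not treated as a fatal error here.
    """
    result = subprocess.run(
        ["smartctl", "-a", device], capture_output=True, text=True
    )
    return result.stdout


def changed_lines(before, after):
    """Return only the lines that differ between two smartctl reports."""
    diff = list(
        difflib.unified_diff(
            before.splitlines(), after.splitlines(), lineterm="", n=0
        )
    )
    # Drop the two file header lines and the "@@" hunk markers, keeping
    # the "-old" / "+new" lines themselves.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Intended use (the device path is a placeholder):
#   before = capture_smart("/dev/sda")
#   ... reproduce the failure ...
#   after = capture_smart("/dev/sda")
#   print("\n".join(changed_lines(before, after)))
```

The full before and after outputs are still what should be posted; the
diff is only a quick way to spot which attributes or error log entries
moved.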

As for ro remounts, I recall applying fixes for that months ago. I
don't remember the details, but some configurations raised an extra PHY
event afterwards and the command was failed without a retry. Anyway, it
got fixed. Please dig through the log for details.

Also, the whole LPM thing got revamped several releases ago. Can you
please test how the recent kernels behave? There will be failures, as
not all hardware can handle LPM well, but those shouldn't lead to
anything catastrophic like ro remounting of the filesystem.

Thanks.

--
tejun