Re: Linux 2.6.29

From: Rik van Riel
Date: Mon Mar 30 2009 - 15:26:18 EST


Linus Torvalds wrote:
> On Mon, 30 Mar 2009, Ric Wheeler wrote:
>
>> Heat is a major killer of spinning drives (as is severe cold). A lot of times,
>> drives that have read errors only (not failed writes) might be fully
>> recoverable if you can re-write that injured sector.
>
> It's not worked for me, and yes, I've tried.

It's worked here. It would be nice to have a device mapper module
that can just insert itself between the disk and the higher device
mapper layer and "scrub" the disk, fetching unreadable sectors from
the other RAID copy where required.
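
A minimal sketch of such a scrub pass, written as a user-space
simulation rather than an actual device mapper target. Every name in
it (Disk, ReadError, scrub, SECTOR_SIZE) is hypothetical, and it
assumes that rewriting an unreadable sector makes it readable again,
which real drives do not always manage.

```python
SECTOR_SIZE = 512  # assumed sector size for the simulation


class ReadError(Exception):
    """Raised when a simulated sector cannot be read."""


class Disk:
    """Hypothetical in-memory stand-in for one RAID1 member."""

    def __init__(self, sectors, bad=()):
        self.sectors = list(sectors)
        self.bad = set(bad)  # sector numbers that currently fail to read

    def read(self, n):
        if n in self.bad:
            raise ReadError(n)
        return self.sectors[n]

    def write(self, n, data):
        # Assumption: a rewrite repairs the sector (fixed in place or
        # remapped by the drive). A damaged drive may not cooperate.
        self.sectors[n] = data
        self.bad.discard(n)


def scrub(disk, mirror):
    """Read every sector of `disk`; repair unreadable ones from `mirror`.

    Returns (repaired, lost): sector numbers rewritten from the mirror,
    and sector numbers that were unreadable on both copies.
    """
    repaired, lost = [], []
    for n in range(len(disk.sectors)):
        try:
            disk.read(n)
        except ReadError:
            try:
                data = mirror.read(n)
            except ReadError:
                lost.append(n)  # no good copy anywhere, cannot repair
                continue
            disk.write(n, data)  # rewrite the injured sector
            repaired.append(n)
    return repaired, lost
```

Sectors that fail on both copies are only reported, not repaired; a
real implementation would also have to decide what to do when the
rewrite itself fails.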

> I'm sure it works for some "ok, the write just failed to take, and the
> CRC was bad" case, but that's apparently not what I've had. I suspect
> either the track markers got overwritten (and maybe a disk-specific
> low-level reformat would have helped, but at that point I was not going
> to trust the drive anyway, so I didn't care), or there was actual major
> physical damage due to heat and/or head crash and remapping was just
> not able to cope.

Maybe a stupid question, but aren't tracks so small compared to
the disk head that a physical head crash would take out multiple
tracks at once? (the last one I experienced here took out a major
part of the disk)

Another case I saw years ago was me writing data to a disk
while it was still cold (I brought it home, plugged it in and
started using it). Once the drive came up to temperature, it
could no longer read the tracks it had just written - maybe the
disk expanded by more than the drive is willing to seek around
for tracks as part of thermal correction? Low-level formatting
the drive made it work perfectly and I kept using it until it
was just too small to be useful :)

> And my point is, IT MAKES SENSE to just do the elevator barrier,
> _without_ the drive command.

No argument there. I have seen NCQ starvation on SATA disks,
with some requests sitting in the drive for seconds, while
the drive was busy handling hundreds of requests/second
elsewhere...

--
All rights reversed.