Re: Driver retries disk errors.

From: James Courtier-Dutton
Date: Mon Aug 30 2004 - 13:44:38 EST


Theodore Ts'o wrote:
On Mon, Aug 30, 2004 at 06:39:31PM +0200, Rogier Wolff wrote:

We encounter "bad" drives far more regularly than most people do
(look at the email address). We are, however, wondering why the
IDE code still retries a bad block 8 times.


I could see retrying 2 or 3 times, but 8 times does seem to be a bit
much, agreed.


In fact, we are regularly able to recover data from drives: we have a
userspace application that retries over and over again, and this
sometimes recovers "marginal" blocks. This could be considered "good
practice" if there is a filesystem requesting the block. On the other
hand, when this happens, the drive is usually beyond being usable for
a filesystem: if we recover one block this way, the next block will
fail and the filesystem "crashes" anyway. In fact, this behaviour
may mask the first warnings that something is going wrong....
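
A rough sketch of the kind of userspace retry loop described above.
This is not the actual recovery tool; the function name, the 512-byte
block size and the default retry count are assumptions made for
illustration. It returns the attempt count so that a caller can log
blocks that only came back after several tries instead of hiding them.

```python
import os


def read_block_with_retries(fd, block_no, block_size=512, max_tries=8):
    """Read one block, retrying on I/O errors or short reads.

    Returns (data, attempts). data is None if every attempt failed.
    block_size and max_tries are illustrative defaults only.
    """
    offset = block_no * block_size
    for attempt in range(1, max_tries + 1):
        try:
            data = os.pread(fd, block_size, offset)
        except OSError:
            # The read failed outright (e.g. EIO); try again.
            continue
        if len(data) == block_size:
            return data, attempt
        # A short read is treated the same as a failure here.
    return None, max_tries
```

On a real device the descriptor would probably need to bypass the page
cache (for example with O_DIRECT and aligned buffers), which this
sketch leaves out. A block that needed more than one attempt is worth
reporting as a warning sign in its own right.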


If the block gets successfully read after 2 or 3 tries, it might be a
good idea for the kernel to automatically do a forced rewrite of the
block, which should cause the disk to do its own disk block
sparing/reassignment.

- Ted
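
The forced-rewrite idea above could look roughly like the following
when modelled in userspace. This is only a sketch under assumptions:
the real change would live in the kernel's IDE error path, and the
function name, the injectable pread/pwrite parameters and the limit of
3 tries are made up for illustration.

```python
import errno
import os


def read_and_refresh(fd, offset, size=512, max_tries=3,
                     pread=os.pread, pwrite=os.pwrite):
    """Read a block; if it only succeeded after a retry, rewrite it.

    Returns (data, rewritten). Raises OSError(EIO) if all tries fail.
    pread/pwrite are injectable so the retry path can be exercised.
    """
    for attempt in range(1, max_tries + 1):
        try:
            data = pread(fd, size, offset)
        except OSError:
            continue  # failed read, try again
        if len(data) != size:
            continue  # short read, treat as a failure
        if attempt > 1:
            # The block was marginal: write the same data back so the
            # drive gets a chance to spare/reassign the sector.
            pwrite(fd, data, offset)
            os.fsync(fd)
            return data, True
        return data, False
    raise OSError(errno.EIO, "block unreadable after retries")
```

Whether a rewrite actually triggers reassignment is up to the drive
firmware; the sketch only shows where the rewrite would be issued.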

It does the same retries with CD-ROMs and DVDs, and if the retries
fail, it disables DMA! It even does the retries when reading CD audio.
Maybe there should be a "retries" setting that can be set by hdparm.
Then we could set the retry count, and what happens when the retries
fail, on a per-device basis.
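
To make the per-device idea concrete, here is a small sketch of what
such a policy table might hold. Everything in it is hypothetical: the
names, the device paths and the action strings are invented for
illustration and do not correspond to any existing hdparm option. The
default mirrors the behaviour described in this thread (8 retries,
then DMA is disabled).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_retries: int   # how many times a failed read is retried
    on_failure: str    # hypothetical action: "report" or "disable_dma"


# Behaviour as described in the thread, used when nothing is configured.
DEFAULT_POLICY = RetryPolicy(max_retries=8, on_failure="disable_dma")

# Hypothetical per-device overrides.
POLICIES = {
    "/dev/hda": RetryPolicy(max_retries=2, on_failure="report"),
    "/dev/hdc": RetryPolicy(max_retries=0, on_failure="report"),
}


def policy_for(device):
    """Return the policy for a device, falling back to the default."""
    return POLICIES.get(device, DEFAULT_POLICY)
```

A CD audio ripper could then ask for zero retries on its drive, while
a data recovery setup could raise the count on the disk being read.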
