RE: Blockbusting news, results end

From: Mudama, Eric
Date: Mon Oct 27 2003 - 12:51:40 EST

>-----Original Message-----
>> If a drive wants to reallocate a block, but due to some temporary
>> condition is unable to (vibration, excessive temperature,
>> etc), odds are there's no way for that drive to "remember" that
>> it needs to reassign that block, so if you reboot the drive or
>> reset it or whatever, you're back at square 1.
>
> Bingo. This is why reallocation at the time of a failed read is also
> necessary. Yes the data are lost, yes the failure needs to
> be both logged (once) and displayed to the user (once), yes if an
> application reads it again before writing then it will be garbage
> or zeroes, but get the LBA sector number moved to a place that is
> less likely to be unreliable.
>
> Meanwhile software must still make up for defective firmware.
>

Reallocating on a failed read doesn't always make sense. A large share of
the errors on the media are caused by poor writes made under various
transient conditions (temperature, shock events, etc.), not by actual media
defects that would prevent writing there in the future. If we get an ECC
error, the only thing we can "reallocate" is the data that already has the
error in it, in which case you're no closer to getting a good block of data
than you were prior to the reallocation.

If you try to write to that LBA, the drive should detect that you're writing
to a marginal area and run some tests to make sure that the new write can be
read back.
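
To make the distinction concrete, here is a rough toy model of that
behaviour. It is only a sketch under my own assumptions: ToyDrive, its
pending set, the bad_physical set and the spare numbering are all
hypothetical names made up for illustration, and none of this is meant to
describe how any particular drive's firmware is written. A failed read just
marks the LBA as suspect; the decision to reallocate is deferred until a
write arrives and the read-back check shows whether the location is really
defective.

```python
from dataclasses import dataclass, field

ZEROS = b"\0" * 512  # what a never-written sector returns in this toy model


class ReadError(Exception):
    """Raised when a sector cannot be read (stands in for an ECC error)."""


@dataclass
class ToyDrive:
    # Every name here is hypothetical; this models no real firmware.
    media: dict = field(default_factory=dict)        # physical sector -> data
    remap: dict = field(default_factory=dict)        # LBA -> spare sector
    pending: set = field(default_factory=set)        # LBAs that failed a read
    bad_physical: set = field(default_factory=set)   # true media defects
    next_spare: int = 1_000_000

    def _phys(self, lba):
        return self.remap.get(lba, lba)

    def _write_phys(self, phys, data):
        # A write to a truly defective sector leaves unreadable contents.
        self.media[phys] = None if phys in self.bad_physical else data

    def corrupt(self, lba):
        """Simulate a poor write caused by a transient condition (shock etc.)."""
        self.media[self._phys(lba)] = None

    def read(self, lba):
        data = self.media.get(self._phys(lba), ZEROS)
        if data is None:
            # The data is already gone, so moving the LBA now recovers
            # nothing. Just remember that this location is suspect.
            self.pending.add(lba)
            raise ReadError(lba)
        return data

    def write(self, lba, data):
        phys = self._phys(lba)
        self._write_phys(phys, data)
        if lba in self.pending:
            # Marginal area: verify the new write by reading it back.
            if self.media[phys] != data:
                # Verification failed, so treat it as a real defect and
                # reallocate now that there is good data to store.
                self.remap[lba] = self.next_spare
                self.next_spare += 1
                self._write_phys(self.remap[lba], data)
            self.pending.discard(lba)
```

In this model an LBA hit by a transient bad write goes back to normal
service after the next successful write without consuming a spare, while an
LBA sitting on a genuinely bad sector is moved to a spare only once the
rewrite fails verification. The pending set lives only in memory here, which
is the same limitation raised earlier in the thread: it does not survive a
reset.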

Also, your term "defective firmware" is getting annoying. What, exactly,
should a drive that knows it cannot access the media due to severe
environmental conditions do in firmware to remember its problems between
power cycles?

--eric