Re: IO error semantics

From: Ric Wheeler
Date: Mon Jan 25 2010 - 12:50:58 EST

On 01/25/2010 12:47 PM, tytso@xxxxxxx wrote:
On Mon, Jan 25, 2010 at 10:23:57AM -0500, Ric Wheeler wrote:

For permanent write errors, I would expect any modern drive to do a
sector remapping internally. We should never need to track this kind
of information for any modern device that I know of (S-ATA, SAS,
SSDs and RAID arrays should all handle this).

... and if the device has run out of blocks in its spare-block
pool, it's probably well past the time to replace said disk.

BTW, I really liked Dave Chinner's summary of the issues involved; I
ran into Kawai-san last week at, and we discussed pretty
much the same thing over lunch. (i.e., that it's a hard problem, and
in some cases we need to retry the writes, such as a transient FC path
problem --- but some kind of write throttling is critical or we could
end up choking the VM due to too many pages getting dirtied and no way
of cleaning them.)

- Ted

Also note that retrying writes (or reads, for that matter) is often counterproductive. For those of us who have suffered through migrating data off of an old, failing disk onto a new, shiny one, excessive retries can be painful...
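
To make the retry point concrete, here is a minimal sketch of a copy loop that caps the number of retries per block and moves on, rather than hammering a failing sector. It is only an illustration and not code from this thread or from any kernel path: copy_with_bounded_retries, read_block, and max_retries are hypothetical names, and the failing device is assumed to surface read errors as OSError.

```python
from typing import Callable, List, Tuple


def copy_with_bounded_retries(
    read_block: Callable[[int], bytes],  # hypothetical reader; raises OSError on failure
    num_blocks: int,
    block_size: int = 512,
    max_retries: int = 1,  # extra attempts after the first one (assumed policy)
) -> Tuple[List[bytes], List[int]]:
    """Copy blocks in order, giving up on a block after 1 + max_retries attempts."""
    copied: List[bytes] = []
    bad_blocks: List[int] = []
    for idx in range(num_blocks):
        data = None
        for _attempt in range(1 + max_retries):
            try:
                data = read_block(idx)
                break  # success, stop retrying this block
            except OSError:
                continue  # failed attempt, retry until the cap is reached
        if data is None:
            # Record the loss and zero-fill so later blocks keep their offsets.
            bad_blocks.append(idx)
            data = b"\x00" * block_size
        copied.append(data)
    return copied, bad_blocks
```

With a cap like this, a block that never reads back costs a fixed number of attempts, and the list of bad blocks can be revisited later once the readable data is safely off the disk.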


