Re: Linux kernel - Libata bad block error handling to user mode program

From: Alan Cox
Date: Fri Mar 05 2010 - 08:02:43 EST

> For clarity, most ATA class disk drives are spec'ed to have one
> non-recoverable error per 150TB or so of writes. Disk drives do blind
> writes (i.e. they are not verified). So we should all expect to have
> the occasional silent data corruption on write. The problem is
> compounded with bad cables, controllers, RAM, etc.

Cable errors should only be a PATA issue; SATA protects both the command
block and the data.

> The only way for the Linux kernel to even attempt to fix that is for it
> to do a read verify on everything it writes. For the vast majority of
> uses that is just not acceptable for performance reasons.

It's also the wrong layer.

> OTOH, if data integrity is of the utmost importance for you, then you
> should maintain an md5 hash or similar for your critical files and verify
> them any time you make a copy. btrfs may offer an auto read-verify. I
> don't know much about btrfs.

If you deal with utterly enormous amounts of data (as some clusters and
the like do) you protect your data from application layer to application
layer. The data leaves ECC protected memory with a hash and it comes back
to ECC protected memory with that hash. That covers a lot of the errors
introduced by the OS, hardware, cables, busses - even drive memory.
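
A minimal sketch of that application-to-application idea, under stated
assumptions: SHA-256 stands in for "a hash", the digest is simply stored
in front of the payload, and the names seal, unseal, store and load are
hypothetical helpers invented for this illustration. It is not how any
particular cluster does it.

```python
import hashlib
import os

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def seal(payload: bytes) -> bytes:
    """Compute the hash while the data is still in (ideally ECC) memory
    and prepend it to the payload."""
    return hashlib.sha256(payload).digest() + payload


def unseal(record: bytes) -> bytes:
    """Recompute the hash after the data is back in memory and compare.
    Raises ValueError if the record was damaged anywhere in between."""
    if len(record) < DIGEST_SIZE:
        raise ValueError("record too short to contain a digest")
    stored, payload = record[:DIGEST_SIZE], record[DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != stored:
        raise ValueError("checksum mismatch: data corrupted in transit or at rest")
    return payload


def store(path: str, payload: bytes) -> None:
    """Write the sealed record and ask the OS to push it to the device."""
    with open(path, "wb") as f:
        f.write(seal(payload))
        f.flush()
        os.fsync(f.fileno())


def load(path: str) -> bytes:
    """Read the sealed record back and verify it before returning it."""
    with open(path, "rb") as f:
        return unseal(f.read())
```

The check runs in the application that consumes the data, so it does not
matter which layer underneath flipped the bits. Note that a load issued
right after store will probably be served from the page cache rather than
the disk, so this detects corruption when the data is actually used later,
not at write time.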
