That's a strange statement, maybe we could get some clarification on it? From the dmesg lines you posted before, it appeared that the hardware was failing the request with a bad disk sense code. As I said before, normally Linux has no problem reading the good parts of a partially bad disk, so I wonder exactly what Mark means by "upper layers which are only zero fault tollerant"?
Some of the fakeraid controllers will kill the disk when the
disk returns a failure like that.
On top of that, even if the controller does not kill the
disk, the application will usually get a fatal disk error
as well, causing the application to die.
The best I have been able to hope for (this is a raid0 stripe
case) is that the fakeraid controller does not kill the disk,
returns the disk error to the higher levels, and lets the
application be killed. At least then you will likely know the disk
has a fatal error, rather than (in the raid0 case) having the
machine crash and having to debug it to determine exactly
what the nature of the failure was.
The same may need to apply when the array is already
in degraded mode ... limping along with some lost data and messages
saying so is a lot better than losing all of the data.
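
To make the "limp along and report it" idea concrete from the
application side, here is a rough sketch. It assumes the controller
and driver pass the failure up as an ordinary EIO on the read, which
is exactly the case that is not guaranteed above. The names
read_tolerant and pread_reader are made up for illustration and are
not part of any existing tool.

```python
import errno
import os


def read_tolerant(read_at, total_size, chunk_size=4096):
    """Read total_size bytes through read_at(offset, length).

    Chunks that fail with EIO are zero-filled and recorded, so the
    caller gets the readable data plus a list of the lost ranges.
    Any other error is re-raised unchanged.
    """
    data = bytearray()
    bad_ranges = []
    offset = 0
    while offset < total_size:
        length = min(chunk_size, total_size - offset)
        try:
            block = read_at(offset, length)
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise
            # Assumption: the bad region shows up as EIO on this chunk.
            bad_ranges.append((offset, length))
            block = b"\0" * length
        if not block:
            break  # unexpected end of device or file
        data += block
        offset += len(block)  # also copes with short reads
    return bytes(data), bad_ranges


def pread_reader(fd):
    """Hypothetical helper: adapt an open file descriptor to read_at."""
    return lambda offset, length: os.pread(fd, length, offset)
```

With a real device you would open it with os.open() and pass
pread_reader(fd) as read_at; the returned bad_ranges list is what
you would log so that the data loss is visible rather than silent.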
Roger