On Thu, 2009-03-19 at 23:32 -0400, Mark Lord wrote:
Allow SCSI to continue with the remaining blocks of a request
after encountering a media error. Otherwise, it may just fail
the entire request, even though some blocks were fine and needed
by a completely different process than the one that wanted the bad block(s).
Signed-off-by: Mark Lord <mlord@xxxxxxxxx>
--- linux-2.6.16.60-0.6/drivers/scsi/scsi_lib.c 2008-03-10 13:46:03.000000000 -0400
+++ linux/drivers/scsi/scsi_lib.c 2008-03-21 11:54:09.000000000 -0400
@@ -888,6 +888,12 @@
*/
if (sense_valid && !sense_deferred) {
switch (sshdr.sense_key) {
+ case MEDIUM_ERROR:
+ /* Bad sector. Fail it, and then continue the rest of the request. */
+ if (scsi_end_request(cmd, 0, cmd->device->sector_size, 1) == NULL) {
+ cmd->retries = 0; // go around again..
+ return;
+ }
But we've been over this. You can't apply something like this because
it ignores retries and chunks up the request a sector at a time. For
the enterprise, that can increase failure time from a few seconds to
hours for 512k transfers.
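As a rough illustration of that arithmetic (a sketch only: the helper name and the 7 seconds per failed command are assumptions, not measured figures), a 512k transfer on 512-byte sectors is 1024 sectors, so failing them one command at a time multiplies the per-command failure time by up to 1024:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: worst-case time to fail a
 * transfer when every sector is issued and fails as its own command.
 * secs_per_failed_cmd is an assumed figure covering the drive's
 * internal recovery attempts plus any retries.
 */
static uint64_t worst_case_seconds(uint64_t transfer_bytes,
                                   uint32_t sector_size,
                                   uint32_t secs_per_failed_cmd)
{
    assert(sector_size != 0);

    uint64_t nr_sectors = transfer_bytes / sector_size;

    return nr_sectors * secs_per_failed_cmd;
}
```

With the assumed 7 seconds, worst_case_seconds(512 * 1024, 512, 7) comes to 7168 seconds, about two hours, against roughly 7 seconds when the whole request fails at once.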
Using the disk supplied data about where the error occurred (provided
the disk returns it) eliminates all the readahead problems like the one
above. Perhaps just turning off readahead for disks that don't supply
error location information would be a reasonable workaround?
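To make the idea concrete, here is a minimal userspace sketch, not the actual scsi_lib.c change. It assumes fixed-format sense data, where bit 7 of byte 0 is the VALID bit and bytes 3 to 6 hold the failing LBA; the function names are hypothetical. If the disk reports a usable location, everything before the bad sector can be completed in one go; otherwise the caller falls back to failing the whole request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract the failing LBA from fixed-format
 * sense data for a current (not deferred) error. Returns 1 and
 * stores the LBA in *lba if the disk supplied one, 0 otherwise.
 */
static int sense_bad_lba(const uint8_t *sense, size_t len, uint64_t *lba)
{
    assert(sense != NULL && lba != NULL);

    if (len < 7)
        return 0;               /* too short to hold the information field */
    if ((sense[0] & 0x7f) != 0x70)
        return 0;               /* only current errors in fixed format */
    if (!(sense[0] & 0x80))
        return 0;               /* VALID bit clear: no location supplied */

    /* Information field, bytes 3..6, big-endian. */
    *lba = ((uint64_t)sense[3] << 24) | ((uint64_t)sense[4] << 16) |
           ((uint64_t)sense[5] << 8)  |  (uint64_t)sense[6];
    return 1;
}

/*
 * Hypothetical helper: number of bytes at the start of the request
 * that can be completed as good, given the reported bad LBA.
 * Returns 0 if the reported location lies outside the request.
 */
static uint64_t good_bytes_before_error(uint64_t start_lba,
                                        uint32_t nr_sectors,
                                        uint32_t sector_size,
                                        uint64_t bad_lba)
{
    if (bad_lba < start_lba || bad_lba >= start_lba + nr_sectors)
        return 0;               /* location not trustworthy for this request */

    return (bad_lba - start_lba) * sector_size;
}
```

The point of the sketch is that a single failed command yields both the good prefix and the bad sector, so there is no need to walk the request a sector at a time.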