Re: Crashed Drive, libata wedges when trying to recover data

From: Alan Cox
Date: Fri Sep 03 2004 - 07:21:35 EST


On Fri, 2004-09-03 at 05:52, Greg Stark wrote:
> I get the same message and the same basic symptom -- any process touching the
> bad disk goes into disk-wait for a long time. But whereas before as far as I
> know they never came out, now they seem to come out of disk-wait after a good
> long time. But then maybe I just never waited long enough with 2.6.6.

This looks hopeful. You are now seeing the IDE layer error dump. Right
now it doesn't decode the LBA block number, although that data is
available in the taskfile, so I can knock up a test patch for you to try
if you want.
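
For reference, here is a minimal sketch of how a 28-bit LBA can be
assembled from taskfile register values. It is not the test patch
mentioned above. The function and parameter names are hypothetical, and
LBA48 commands, which also use the high-order-byte registers, are not
handled.

```c
#include <stdint.h>

/*
 * Assemble a 28-bit LBA from ATA taskfile register values.
 * lbal/lbam/lbah hold bits 0-7, 8-15 and 16-23; the low nibble of the
 * device register holds bits 24-27. Names here are illustrative only.
 */
uint32_t lba28_from_taskfile(uint8_t lbal, uint8_t lbam,
                             uint8_t lbah, uint8_t device)
{
    return ((uint32_t)(device & 0x0F) << 24) |
           ((uint32_t)lbah << 16) |
           ((uint32_t)lbam << 8) |
           (uint32_t)lbal;
}
```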

> This means I would be able to do the recovery in theory, but in practice it'll
> just take an infeasible length of time. I have gigs of data to go through and
> at the amount of time it takes to time out after each error it'll take me many
> days (years I think) just to figure out which blocks to avoid.

Open the disk device directly with O_DIRECT and read it in something
like 64K chunks. That bypasses readahead, so it gets easier to work out
the problem areas. You can then sit in a loop doing

	if (pread() failed) {
		write blank
		log
	} else {
		write data
	}

then go back the next morning and binary search the holes it logged.
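
The loop above can be sketched in C roughly as follows. This is a
minimal sketch under stated assumptions, not something tested against a
failing drive: the names open_direct and rescue_copy, the log format
(one "offset length" pair per line), and the 4096-byte buffer alignment
are all assumptions made for illustration.

```c
#define _GNU_SOURCE           /* exposes O_DIRECT in <fcntl.h> on Linux */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Open the device read-only, bypassing the page cache and readahead. */
int open_direct(const char *path)
{
    return open(path, O_RDONLY | O_DIRECT);
}

/*
 * Copy `total` bytes from src_fd to dst_fd, reading `chunk` bytes at a
 * time. A chunk that cannot be read in full is written out as zeros and
 * its offset and length are appended to `log`.
 * Returns the number of bad chunks, or -1 on an allocation or write error.
 */
long rescue_copy(int src_fd, int dst_fd, FILE *log,
                 size_t chunk, off_t total)
{
    void *buf;
    long bad = 0;

    /* O_DIRECT transfers must be sector-sized multiples; 512 is assumed. */
    assert(chunk > 0 && chunk % 512 == 0);

    /* O_DIRECT also needs an aligned buffer; 4096 is an assumed alignment. */
    if (posix_memalign(&buf, 4096, chunk) != 0)
        return -1;

    for (off_t off = 0; off < total; off += (off_t)chunk) {
        size_t left = (size_t)(total - off);
        size_t want = left < chunk ? left : chunk;

        /* Always request a full chunk so the length stays aligned. */
        ssize_t got = pread(src_fd, buf, chunk, off);

        if (got < (ssize_t)want) {
            /* Read failed or came up short: write blank and log the hole. */
            memset(buf, 0, want);
            fprintf(log, "%lld %zu\n", (long long)off, want);
            bad++;
        }

        if (pwrite(dst_fd, buf, want, off) != (ssize_t)want) {
            free(buf);
            return -1;
        }
    }

    free(buf);
    return bad;
}
```

A caller would pass the descriptor from open_direct() as src_fd, an
ordinary image file as dst_fd, 64 * 1024 as chunk, and the device size
as total (for example from lseek() to SEEK_END). Each logged hole can
then be re-read with smaller chunks to narrow down the bad sectors.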

Alan
