Re: Possible EXT2 race

From: Robert Hancock
Date: Fri Dec 07 2007 - 22:45:22 EST


linux-os (Dick Johnson) wrote:
> On Fri, 7 Dec 2007, Dave Jones wrote:
>
>> On Fri, Dec 07, 2007 at 08:15:42AM -0500, linux-os (Dick Johnson) wrote:
>>
>>> Dec 7 04:05:55 chaos kernel: sd 0:0:1:0: [sdb] Add. Sense: Peripheral device write fault
>> This sounds more like a hardware problem.
>>
>> Dave
>
> There was an attempt to write beyond the end of the device because
> everything in the file-system was getting trashed. I can read/write
> the 5 year-old SCSI physical drive with no errors from both within
> linux and through the Adaptec BIOS. This problem only occurs
> when I attempt to truncate a file that is being written by another
> task.

That SCSI error code doesn't sound like a reasonable one for the drive getting a bad block address. The more typical one in that case would be "Logical block address out of range", or maybe the catch-all "Invalid field in CDB". "Peripheral device write fault", especially as a deferred error (i.e. after the drive already returned a normal completion for the data, and then is reporting the failure to actually write to the media on the next command), really sounds like a drive problem.
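To make the distinction between those codes concrete, here is a minimal sketch of decoding fixed-format SCSI sense data. It assumes the standard fixed-format layout (response code in byte 0, sense key in the low nibble of byte 2, additional sense code and qualifier in bytes 12 and 13). The function name and the sample buffer are hypothetical, made up for illustration, and are not taken from the logs in this thread.

```python
# Sketch only: decodes fixed-format sense data (response codes 0x70/0x71).
# The three ASC/ASCQ pairs are the ones named in the paragraph above.
ASC_NAMES = {
    (0x03, 0x00): "Peripheral device write fault",
    (0x21, 0x00): "Logical block address out of range",
    (0x24, 0x00): "Invalid field in CDB",
}


def decode_fixed_sense(sense: bytes) -> dict:
    """Return the fields of interest from a fixed-format sense buffer."""
    if len(sense) < 14:
        raise ValueError("need at least 14 bytes of fixed-format sense data")
    response_code = sense[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    asc, ascq = sense[12], sense[13]
    return {
        # 0x70 = current error, 0x71 = deferred error (reported on a later command)
        "deferred": response_code == 0x71,
        "sense_key": sense[2] & 0x0F,
        "asc": asc,
        "ascq": ascq,
        "text": ASC_NAMES.get((asc, ascq), "unknown"),
    }


# Hypothetical buffer: deferred error with ASC/ASCQ 03h/00h.
SAMPLE = bytes([0x71, 0, 0x03, 0, 0, 0, 0, 0x0A,
                0, 0, 0, 0, 0x03, 0x00, 0, 0, 0, 0])
info = decode_fixed_sense(SAMPLE)
print(info["deferred"], info["text"])
```

A bad block address would be expected to show up as one of the other two table entries and as a current error, not a deferred one.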

And the kernel is supposed to trap bad block addresses at the disk layer, as these messages say it is doing, _after_ that error occurs:

Dec 7 04:08:13 chaos kernel: attempt to access beyond end of device
Dec 7 04:08:13 chaos kernel: sdb1: rw=0, want=29687515944, limit=33736437
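For reference, the check behind that message amounts to a range test on the request. The following is a rough sketch, not the kernel's actual code: it assumes "want" is the request's end sector (start plus length) and "limit" is the partition size, both in 512-byte sectors. The function name and the 8-sector request length are assumptions for illustration; only the want and limit values come from the log.

```python
def beyond_end_of_device(sector: int, nr_sectors: int, limit: int) -> bool:
    """True if the request [sector, sector + nr_sectors) runs past `limit`."""
    # Written as two comparisons so the check stays valid even for
    # fixed-width arithmetic, where sector + nr_sectors could wrap.
    return nr_sectors > limit or sector > limit - nr_sectors


WANT = 29687515944   # from the log line above
LIMIT = 33736437     # from the log line above
NR = 8               # assumed request length (4 KiB), not from the log

# Start sector implied by the assumed request length.
start = WANT - NR
print(beyond_end_of_device(start, NR, LIMIT))

# Rough sizes, assuming 512-byte sectors.
print(LIMIT * 512, WANT * 512)
```

Under the 512-byte assumption the partition is about 17.3 GB, while the requested sector sits roughly 15 TB into the device, so the request is rejected before it ever reaches the drive.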

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@xxxxxxxxxxxxx
Home Page: http://www.roberthancock.com/
