Re: ext2 corruption in 2.0.33

Gadi Oxman (gadio@netvision.net.il)
Mon, 26 Jan 1998 22:46:45 +0300 (IST)


On Mon, 26 Jan 1998, Theodore Y. Ts'o wrote:

> There has been exactly one other such report (although unfortunately I
> don't have the kernel version number) from Nick Holloway
> (Nick.Holloway@alfie.demon.co.uk) on September 30, 1997.
>
> My guess then, and it remains the same, since if there was a problem
> with the kernel we should have seen a lot more complaints than just
> these two isolated reports in over six months, was a hardware or DMA
> glitch which caused data to get written to the wrong place.
>
> It's possible that it could be a kernel problem, though, which is why I
> do keep track and file such reports when people raise them, in order to
> find patterns if they exist. So I'll file your report and see if anyone
> else reports these problems (I track linux-kernel and
> comp.os.linux.development.system looking for such bug reports.)
>
> - Ted

I saw a strange filesystem corruption while we were working on
RAID-5 reconstruction support back in September/October (2.0.31).

The RAID reconstruction thread runs the following loop:

	for (each block) {
		bh = bread(block);	/* serviced using the operational drives */
		mark_buffer_dirty(bh);	/* writes will be serviced by all drives */
	}
	fsync_dev(...);
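
For readers unfamiliar with how a read can be "serviced using the
operational drives": in RAID-5, the block on a missing drive equals the
XOR of the same-offset blocks on the surviving drives (data and parity
alike). A minimal user-space sketch of that step follows. It is not the
md driver's code; raid5_rebuild_block() and its parameters are
hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not kernel code: rebuild the block of a failed
 * drive by XOR-ing the same-offset blocks of the surviving drives
 * (data and parity alike).
 */
void raid5_rebuild_block(unsigned char *out,
                         const unsigned char *const *surviving,
                         size_t ndrives, size_t block_size)
{
    size_t d, i;

    memset(out, 0, block_size);          /* XOR identity */
    for (d = 0; d < ndrives; d++)
        for (i = 0; i < block_size; i++)
            out[i] ^= surviving[d][i];
}
```

Calling the same helper over all data blocks yields the parity block,
since parity is defined as the XOR of the data blocks in the stripe.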

The reconstructed filesystem had a very interesting corruption pattern:
one block in an inode table was overwritten with an increasing byte
pattern,

	0 1 2 3 4 5 .. 127

which was repeated at each 128-byte boundary (every inode) in the
block. I recall that I tried to reproduce it, without success.
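
Anyone who wants to scan a dumped inode-table block for the same
signature could use something like the check below. This is only a
sketch under the assumption that the pattern is exactly the bytes
0..127 restarting at every 128-byte inode slot, as described above;
block_has_ramp_pattern() and INODE_SIZE are names made up for this
example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed period of the pattern: one 128-byte inode slot. */
#define INODE_SIZE 128

/*
 * Hypothetical helper: return 1 if every 128-byte slot of the block
 * holds the bytes 0, 1, 2, ..., 127 in order, 0 otherwise.
 */
int block_has_ramp_pattern(const unsigned char *block, size_t block_size)
{
    size_t i;

    /* The block must hold a whole number of inode slots. */
    assert(block_size > 0 && block_size % INODE_SIZE == 0);

    for (i = 0; i < block_size; i++)
        if (block[i] != (unsigned char)(i % INODE_SIZE))
            return 0;
    return 1;
}
```

With a 1024-byte block this checks eight consecutive inode slots; a
single mismatching byte anywhere makes it return 0.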

Gadi