Major disk corruption + fsck problem ?

Michel LESPINASSE (walken@via.ecp.fr)
Thu, 2 Jan 1997 00:05:43 +0100 (MET)


Hi,

I've been the victim of a major disk corruption problem. I had been
running a 2.1.6 kernel for a few weeks, under moderate load (about 70%
CPU usage 12 hours a day, though this shouldn't be the problem), when I
suddenly got a dozen e2fs error messages on the console. I think it was
the "bit already cleared for inode ..." error, but I'm not sure.

I rebooted (cleanly with shutdown, not a reset) and ran fsck in single
user mode (with the disk mounted read-only, of course). It reported a lot
of errors (for example, lots of block numbers greater than my disk size).
I didn't notice any particular bit pattern in these bogus block numbers
(they looked like evenly distributed 32-bit numbers). Then came a lot of
other errors that I didn't even take note of, because I was much too bored.
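
For anyone wanting to check such numbers less by eye: a minimal sketch,
assuming the bogus block numbers have been copied by hand out of the
e2fsck output into a list. The function names and the sample values are
made up for illustration; they are not from the actual run. A bit that is
set (or clear) in every bogus value would hint at a stuck line in the
hardware rather than random garbage.

```python
from functools import reduce

def common_bits(values, width=32):
    """Return (always_set, always_clear) bit masks over all values.

    A non-zero mask means some bit position has the same state in
    every value, which evenly distributed random numbers rarely show
    once the list is more than a handful long.
    """
    mask = (1 << width) - 1
    always_set = reduce(lambda acc, v: acc & v, values, mask)
    always_clear = reduce(lambda acc, v: acc & (~v & mask), values, mask)
    return always_set, always_clear

def out_of_range(values, block_count):
    """Return the block numbers that cannot exist on the filesystem."""
    return [v for v in values if v >= block_count]

if __name__ == "__main__":
    # Hypothetical values, NOT taken from the real fsck output.
    bogus = [0x8F3A21C4, 0x13D70B9E, 0xC0054A71, 0x6E9912F0]
    block_count = 1_000_000  # assumed filesystem size in blocks
    set_mask, clear_mask = common_bits(bogus)
    print(f"bits always set:   {set_mask:#010x}")
    print(f"bits always clear: {clear_mask:#010x}")
    print(f"out of range: {len(out_of_range(bogus, block_count))}")
```

With only a few samples a common bit can appear by chance, so the masks
only mean something on a long list.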

Then I ran a second fsck, "just to check that everything got corrected".
The problem is that I got a few new errors! (The third fsck was clean,
though.)

Now my whole disk is gone, and all I can see in / is my lost+found
directory. I don't see it with ls, only with my shell's completion
feature :-/

I don't have any hope for my data, but I think the fsck problem on the
second run should be reported. Specifically, the errors I got on the
second run were all related to the ".." entries of some directories. I
think that when e2fsck moves a directory to /lost+found, it "forgets" to
change the ".." entry. Is this an fsck bug, or am I going crazy?
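
To make the suspicion concrete, here is a toy in-memory model of the
invariant involved: every directory's ".." entry must name the directory
that actually holds it. This is only a sketch under my own assumptions,
not how e2fsck is implemented; the table layout, the function names and
the inode numbers are all hypothetical.

```python
LOST_FOUND = 11  # arbitrary inode number chosen for this toy model

def reconnect(dirs, inode, new_parent, fix_dotdot=True):
    """Move directory `inode` under `new_parent` in the toy table.

    With fix_dotdot=False the ".." entry is left pointing at the old
    parent, which is the behaviour suspected in the report.
    """
    dirs[inode]["parent"] = new_parent
    if fix_dotdot:
        dirs[inode]["dotdot"] = new_parent

def bad_dotdot(dirs):
    """Return inodes whose ".." does not match their real parent."""
    return sorted(i for i, d in dirs.items() if d["dotdot"] != d["parent"])

if __name__ == "__main__":
    # inode -> real parent and the inode stored in its ".." entry
    dirs = {
        20: {"parent": 2, "dotdot": 2},
        21: {"parent": 20, "dotdot": 20},
    }
    reconnect(dirs, 21, LOST_FOUND, fix_dotdot=False)
    print("after first pass: ", bad_dotdot(dirs))
    reconnect(dirs, 21, LOST_FOUND)
    print("after second pass:", bad_dotdot(dirs))
```

If the first run reconnects a directory without rewriting "..", a second
run would find exactly this kind of mismatch and fix it, and a third run
would then come out clean.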

Now a question: are there any known bugs in the 2.1.6 kernel that would
explain my corruption problem, or should I suspect my hardware? As I
said, I didn't notice any particular bit patterns, but I'm still worried.
I wouldn't like this to happen again.

My (dead) configuration:
kernel 2.1.6 (with the memcpy patch)
e2fsck 1.06
IDE disk, Triton chipset (DMA disk access used)
P133 with 32 MB of RAM (but I doubt this is important)

Michel "Walken" LESPINASSE - Student at Ecole Centrale Paris (France)
www Email : walken@via.ecp.fr
(o o) VideoLan project : http://videolan.via.ecp.fr/
------oOO--(_)--OOo-------------------------------------------------------
Yow ! 1135 KB/s remote host TCP bandwidth over 10Mb/s ethernet. Beat that!