2.1.79: filesystem corruption with ext2fs deleting a lot of files

segmentation.fault@core.dumped.org
20 Jan 1998 10:24:57 GMT


I've had some severe problems with 2.1.79 (with the fatfs patch). It apparently
has a bug that can damage ext2fs partitions.

I was removing a large series of files (the first time with rm -rf, the second
time with Midnight Commander). The first time, I deleted a series of files and
the hard disk was busy, but when I looked back after a while, the rm had simply
hung. CTRL+Z and CTRL+C had no effect, and ps showed a 'D' in the status column
(uninterruptible sleep, if I'm correct; the process looked deadlocked).
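
For anyone who wants to look for stuck processes without ps, here is a minimal
sketch that reads the state field from /proc/<pid>/stat and lists everything in
state 'D'. The function names are made up for this example, and I have not
checked it against the kernels mentioned in this report, so treat it as an
illustration rather than a tested tool.

```python
import os


def parse_state(stat_line):
    """Return (pid, comm, state) from the text of one /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no procfs mounted at proc_root
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_state(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A process that shows up here on every run, like the hung rm, is waiting inside
the kernel and will not react to signals until that wait ends.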

I was unable to kill the process, and another rm on the same directories to get
rid of the files also deadlocked. Apparently many operations on the partition
where the rm had failed cause the calling process to deadlock.

A umount of that partition also deadlocked, and so did sync. Syncs and
unmounts started via the magic Alt+SysRq keys didn't help either. I rebooted,
fsck ran and corrected the usual deleted inodes, but yet another rm attempt on
the same directory tree caused another deadlock, and the whole story began
again.

Then I rebooted with init=/bin/bash and ran fsck on the partitions with -f -v,
which reported a lot of errors (which I didn't write down), but after that
rm-ing went fine.

A day later, I had to delete another large series of files on the same
partition. The same deadlocks happened again. I rebooted to 2.0.33, but
didn't touch the directories yet. When I decided, another day later, to
continue deleting the files I had been deleting the day before, 2.0.33
deadlocked with the exact same symptoms: many operations on the partition
caused the process to deadlock, and neither sync nor umount worked. After the
init=/bin/bash and fsck -f -v rituals (this time I did keep the error report),
the system worked fine again.

Apparently 2.1.79 does something really bad when removing files on an ext2fs
partition, and leaves it in such an ugly state that 2.0.33 deadlocks on it
too.

My partitions (/home was the partition on which both corruptions happened):

/dev/hda1 on / type ext2 (rw,bsddf,check=strict,errors=remount-ro)
/dev/hda2 on /dos type msdos (rw,noexec,nosuid,nodev,uid=777,gid=500,umask=022,quiet)
/dev/hda7 on /home type ext2 (rw,nodev,usrquota,bsddf)
/dev/hda3 on /home/local/unilynx type ext2 (rw,nosuid,nodev,bsddf,resuid=777)
/dev/hda9 on /tmp type ext2 (rw,nosuid,nodev,bsddf)
/dev/hda5 on /usr type ext2 (rw,nodev,bsddf)
/dev/hda8 on /var type ext2 (rw,nodev,noatime,bsddf)
none on /proc type proc (rw)
failing:(pid292) on /net type auto (intr,rw,port=1023,timeo=8,retrans=110,indirect,map=/etc/amd.conf)

The fsck report on the failing partition:

e2fsck 1.10, 24-Apr-97 for EXT2 FS 0.5b, 95/08/09
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Inode 20271 ref count is 21, should be 20. Fix? yes

Pass 5: Checking group summary information

/dev/hda7: ***** FILE SYSTEM WAS MODIFIED *****

16026 inodes used (15%)
1552 non-contiguous inodes (9.7%)
# of inodes with ind/dind/tind blocks: 2961/143/0
371590 blocks used (90%)
0 bad blocks

14661 regular files
1240 directories
0 character device files
0 block device files
0 fifos
722 links
116 symbolic links (116 fast symbolic links)
0 sockets
--------
16739 files

-- 
    Arnold Hendriks          I know it doesn't have any bugs, I downloaded
 unilynx$irc.xnet.org        it right off of www.microsoft.com   --FinaLHouR