All filesystems hang under long periods of heavy load (read and write)on a filesystem
From: Shirley Shi
Date: Mon Nov 03 2003 - 18:47:17 EST
Can anyone know why all filesystems hang under periods of heavy load on
one of the filesystem? Once the filesystems hang, any command related to
the filesystem, like 'ls', 'cat',etc., will stick forever until re-power
cycling the machine.
I kept running the following script to read and write the data on a same
filesystem(ext2 or XFS) since we need do some tests for the storage. Is
half day, onn the beginning, the system was running well. But after
running the script for a long time, such a half day, one day or two
days, all filesystems would get hung, including the root filesystem
although I didn't do any heavy load on it. The file(M.1) I used for
reading and writing is about 2.5GB.
@ total = 115
while (1)
@ cc = 2
while ($cc <= $total)
dd bs=512k if=/data/M.1 of=/data/M.$cc
echo "copying $cc of $total..."
@ cc = $cc + 1
end
rm -f /data/M.*
end
I tried RH8.0 with kernel 2.4.18 and kernel 2.4.21 with XFS and patch
rmap15j. I have the same issue running with the two kernels. Basically I
have two filesytems configured. One for the root configured with ext3,
and another is for the data configured with ext2 or XFS. With either
ext2 or XFS, I have the same problem.
I also tried on different Dual CPU machines as follows, but saw the same
problem.
- A dual-CPU machine with an on-board 320 SCSI controller(running
AIC79XX.o driver) connecting a disk drive as the root system and a
MegaRAID 320-4X controller connecting to several disk drives. I created
two H/W logical RAID devices and built the data filesystem with the S/W
RAID0(/dev/md0).
- A HP dual-CPU machine with the HP Smart Array 5i connecting to a disk
drive as the root system and a 160 SCSI controller connecting with 13
disk drives and configured as a S/W RAID0(/dev/md0) the data filesystem.
- A dual-CPU machine with an on-board 320 SCSI controller(with AIC79XX.o
driver) connecting a disk drive as the root system and a FC controller
connecting to a RAID storage.
Any comment would be appreciated.
Thanks,
Shirley
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/