Crash Recovery Problems (fwd)

James G. Stallings II (zap@onyx.tarpon.net)
Tue, 11 Nov 1997 12:06:25 -0600 (CST)


We recently experienced a fairly devastating crash, resulting in the loss
of our root partition. This partition contained only the Linux OS; our
/home and /root are separate, and were relatively untouched (I lost the
contents of my personal mailbox, 900+ messages).

We run RedHat (the colgate release at the time of the crash) on kernel
version 2.0.18. The aic7xxx driver on the machine has long given us
problems, but in the course of a year we have had only one other
significant crash, from which we recovered with practically no
complications.

When we crashed this time we upgraded to the current release (biltmore),
with kernel version 2.0.30, no patches (we have yet to get the system
stable enough to patch kernel sources).

The hardware this runs on is a dual P166 Tyan w/a Seagate Hawk (2.0GB) on
an Adaptec 2940 Ultra. We are -not- currently using SMP.

My problem is this: I've installed a Western Digital Caviar drive, a 2.0
GB IDE disk, and doing just about anything meaningful with it (i.e., using
it as a backup device for the SCSI drive) drives the system load average
up to 1.5 or 2.0. Formatting it as two partitions (one of about 78MB, the
other the remaining space) generates a message like 'Couldn't get a Free
Page'. Formatting it as four partitions generates 'Nov 10 13:37:20 onyx
kernel: Unable to handle kernel paging request at virtual address
ef6a006a' (that message occurred while formatting the final partition, of
1.3 GB).

Piped tar commands to copy files across (on an earlier occasion when I had
formatted the disk successfully) also drove up the load average, even when
nice'd at 19, and ultimately crashed the system with the 'Couldn't get a
free page' message.

Any help or suggestions would be greatly appreciated.

Oh, BTW, the M/B uses the Intel Triton II IDE Chipset.

Thanks,
James