2.6.0-test2-mm1, ext3 (external journal): nasty filesystem corruptionunder high load

From: mb
Date: Tue Aug 12 2003 - 05:19:10 EST


Hi,

Admittedly I was being pathological, but I've got a new toy to play with!
Our new server is a dual-Athlon, 1.5G RAM (the other .5 failed memtest) +
about 6GB swap, with 15x70GB drives running under gdth.o with 12 as the
RAID-5 set, and the journal on 2 as a RAID-1 pair. System on IDE for now.

It's currently running Red Hat "severn" + 2.6.0-test2-mm1 (with PREEMPT
for now), and this particular stress test was attempting to build
2.6.0-test3-mm1 with the scary invocation "make -j". More info on request.

I saw thousands of messages like:
cc1: page allocation failure. order:0, mode:0x20
(where only the process names might change). I don't know how Bad this is.

Amazingly I could still ssh in to the box and discover that its load had
more than likely broken 1000. However, the compile started to complain
bitterly about non-ASCII characters in source files, and indeed corruption
did occur (random overwriting, it would appear).

I have a couple more weeks I can play with this box before it has to go
into production (running much older brains), and can do more tests if
anyone thinks it might be useful.

I would like to suggest the "make -j" test for those developers with
enough memory (and fast enough swap).

Matt
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/