Re: File Corruption in Kernel 2.4.18

From: Kelledin (kelledin+LKML@skarpsey.dyndns.org)
Date: Wed Jul 17 2002 - 23:16:02 EST


Ok, the test:

I chose directory /home/kelledin/gnutella. It contains
approximately 10GB of files, ranging in size from ~5MB to
~700MB. Most are ~600-650MB.

System specs are here:
http://www.anandtech.com/mysystemrig.html?rigid=5092

Only out-of-date info on that page is the kernel--I'm running
2.4.18+XFS-1.0.2+RML-preempt, compiled with gcc-2.95.3. Kernel
was booted with "acpi=no-idle mem=nopentium" options.

While I was compiling jdk-1.4.0, I did the following:

[ kelledin@valhalla ~ ] # mkdir gnutella2
[ kelledin@valhalla ~ ] # cp -a gnutella gnutella2
[ kelledin@valhalla ~ ] # for FNAME in gnutella/*; do cmp
"$FNAME" "gnutella2/$FNAME"; done

The "cp -a" operation took 19 minutes, during which the system
load reached approximately 4.0 and the CPU temperature held at
54 C. Ambient case temperature held at 26 C. Swap usage did
not change. System was somewhat sluggish but responsive enough
to play an mp3 and allow me to open terminal windows. j2sdk
compile is still apparently going strong.

The comparison check...well, it finished while I was away
 getting a snack. It printed no output, which means the check
 probably completed successfully. Maybe I'll run some md5 sums
 later, just to be sure.

System load stayed at about 3.0, and temperatures remained
approximately the same as during the copy operation.

The relevant software:

kernel...well, you know.
glibc-2.2.5+linuxthreads+LSB+blowfish+math patches
libacl-2.0.11
libattr-2.0.8
bash-2.05a (Just for you, Hell.Surfers, just for you ;)
fileutils 4.1.8 with ACL patches and a Kelledin special.
 Tarball can be found at:

ftp://skarpsey.dyndns.org/fileutils-4.1.8acl-kelledin.tar.bz2

Things that might be causing the corruption in our friend
J.Hart's case:

Buggy chipset (damn VIA!!!)
Faulty CPU (heat damage, chipped core?)
Faulty hard drive (hey, it's a DeathStar.)
Faulty IDE controller (if using offboard IDE)
Flaky cable (80-conductor ATA cable doesn't like being folded,
stacked, crumpled, etc., not even slightly)
Buggy IDE driver in the kernel
Buggy filesystem driver
Buggy fileutils
Buggy VM

I can't really test any of the possible software problems,
because I'm all SCSI, all XFS, bleeding-edge fileutils, and
didn't have any really significant swapping going on. There's a
production server I could possibly test it on, but...well...it's
a production machine. Maybe I'll repeat the test a few times
later.

On Wednesday 17 July 2002 10:11 pm, Kelledin wrote:
> This could possibly be a problem with your hard drive.
> Judging from the model number, you have a 45GB IBM DeskStar
> 75GXP, one of the first IBM drives to earn the nickname
> "DeathStar" for its high failure rate. What does IBM's Drive
> Fitness Test tell you?
>
> I'll see about performing your test tonight; I've got a hefty
> little DivX directory I can throw around as I wait for
> j2sdk-1.4.0 to finish compiling. Such a test should be
> sufficient...
>
> This could also be a recurrence of ye olde VIA686B PCI+IDE
> issue. IIRC, some VIA686B motherboards that had that flaw were
> effectively unfixable, simply because certain motherboard
> manufacturers spotted the problem before everyone else (even
> VIA?) and tried their own partial kludge fixes for it. Gotta
> love VIA.

--
Kelledin
"If a server crashes in a server farm and no one pings it, does
it still cost four figures to fix?"
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Jul 23 2002 - 22:00:25 EST