2.4.1-pre8 losing pages

From: Peter Horton (pdh@colonel-panic.com)
Date: Thu Jan 25 2001 - 18:16:59 EST


I'm experiencing repeatable corruption whilst writing large volumes of
data to disk. Kernel version is 2.4.1-pre8, on an 850MHz AMD Athlon on an
ASUS A7V (VIA KT133 chipset) motherboard with 128M RAM (tested with
'memtest86' for 10 hours).

I first noticed that fsck was reporting small corruptions on my ext2
volume. My first suspect was the much-discussed VIA IDE controller. As a
test I created a 128M file from "urandom" and copied it to twenty-six
files. When I MD5 the files, one or two of them are usually corrupt. The
damage usually occurs in the 24th copy (though not always). Inspecting
the files shows a single 4K block (aligned on a 4K boundary) that is
completely different from what it should be. The kernel logs no errors
whilst writing the corrupt files.
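
For anyone wanting to reproduce the test above, here is a rough sketch of
it in Python. It is not the exact procedure used here, only an
approximation: the function names, the file names and the working
directory are made up for illustration, and the defaults (128M, 26 copies)
are taken from the description above. Note that checksumming straight
after the copy may be served from the page cache rather than from disk.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1M chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def run_copy_test(workdir, size=128 * 1024 * 1024, copies=26):
    """Write a random master file, copy it `copies` times, then checksum.

    Returns (master_digest, list_of_copies_whose_digest_differs).
    File names here are hypothetical, chosen only for this sketch.
    """
    os.makedirs(workdir, exist_ok=True)
    master = os.path.join(workdir, "master.bin")

    # Fill the master file with random data, 1M at a time.
    with open(master, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(remaining, 1 << 20)
            f.write(os.urandom(n))
            remaining -= n

    expected = md5_of(master)

    # Make all the copies first, then checksum them, as in the test above.
    targets = [os.path.join(workdir, "copy%02d.bin" % i) for i in range(copies)]
    for dst in targets:
        shutil.copyfile(master, dst)

    bad = [dst for dst in targets if md5_of(dst) != expected]
    return expected, bad
```

Calling run_copy_test("/some/scratch/dir") on the volume under test
returns the master digest and the list of copies that do not match it. On
a healthy system that list should be empty.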

I've repeated the test on the other on-board IDE controller (Promise), a
different hard disk, and on reiserfs. I see the corruption in all cases.

I tried building the kernel for "Pentium-Classic", and I tried a few older
kernels (2.4.0-test5 and 2.4.0-test12), still bad (all kernels built with
GCC 2.95.2 - Debian potato).

I really could do with some help as to where to look next :-). I did try
to come up with a test to see whether bad data is written or whether the
damaged piece is just not written, but if I alter the testing procedure
too much the problem seems to go away. It seems to just lose a single page
under one very specific circumstance.
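
One way to look at the damaged block without changing how the files are
written is to examine a corrupt copy after the fact. The sketch below is
only an illustration, with hypothetical function names: it lists the 4K
blocks that differ from the master, and then checks whether a bad block is
all zeros or is identical to some block of the master. Because the master
is random data, a match against another master block would hint at data
landing at the wrong offset; anything else would hint at stale or foreign
data. What an all-zero block means depends on the filesystem, so treat the
labels as hints rather than a diagnosis.

```python
BLOCK = 4096  # page size assumed from the 4K-aligned damage described above


def differing_blocks(good_path, bad_path, block=BLOCK):
    """Return the indices of the blocks that differ between two files."""
    diffs = []
    with open(good_path, "rb") as g, open(bad_path, "rb") as b:
        index = 0
        while True:
            gb = g.read(block)
            bb = b.read(block)
            if not gb and not bb:
                break
            if gb != bb:
                diffs.append(index)
            index += 1
    return diffs


def classify_block(good_path, bad_path, index, block=BLOCK):
    """Classify one block of the bad file.

    Returns ("zeros", None), ("source_block", i) if the data equals block i
    of the good file, or ("unknown", None) otherwise.
    """
    with open(bad_path, "rb") as b:
        b.seek(index * block)
        data = b.read(block)

    if data and data.count(0) == len(data):
        return ("zeros", None)

    # Scan the good file for an identical block at any offset.
    with open(good_path, "rb") as g:
        i = 0
        while True:
            gb = g.read(block)
            if not gb:
                break
            if gb == data:
                return ("source_block", i)
            i += 1
    return ("unknown", None)
```

For each index returned by differing_blocks(master, copy), calling
classify_block(master, copy, index) gives a first hint as to what ended up
in the damaged page.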

P.

( configs attached )





