Re: Data corruption

From: Jeff Garzik
Date: Tue Aug 07 2007 - 18:12:09 EST


paul wrote:
For the past 2-3 months I have had random data corruption on my Linux server. After checking the disks independently (I'm using RAID1 on 2 SATA disks; the problem is the same w/o RAID) and the memory, hardware seems to be ruled out as the cause...

Here is my problem:
=> head --bytes=300m /dev/urandom > test
=> for i in `seq 0 9` ; do cp test test$i ; done
=> md5sum test*
I got:
014666c728c9e3b8299579fae499864a test
014666c728c9e3b8299579fae499864a test0
333fd93d093ac612cd8d5f65628f734e test1
1ab6ee68c6a7d9ff5a05f9d63f0f6df6 test2
96e96483e3175a59c9c05b6720514e1e test3
014666c728c9e3b8299579fae499864a test4
b24dbccc9f4831f8825ab4a55a3be4aa test5
8493efc9c14e4b5c162ac23696fbc16a test6
6a5f4301f66d0379049d79d0e14e2a87 test7
2c81cfa1c3a03aba134574922ee5d75c test8
2ea15c8392bfd0123472a80125bb3abe test9

^^^ that sounds really bad for my data :(
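
The copy-and-checksum test above can be wrapped in a small script so it is easy to rerun on another kernel or controller. This is only a sketch, not part of the original report: the function name check_copies, its arguments, and the corrupt-test.XXXXXX scratch directory name are all illustrative, and it assumes a head that accepts -c (GNU and BSD do). It creates the scratch directory under the directory you pass in, so point it at the filesystem you want to exercise.

```shell
#!/bin/sh
# Sketch: write a random file, copy it several times, and report every
# copy whose MD5 differs from the original. All names are illustrative.
check_copies() {
    size_mb=${1:-300}    # size of the test file in MiB
    copies=${2:-10}      # number of copies to make
    base=${3:-.}         # directory on the filesystem under test

    dir=$(mktemp -d "$base/corrupt-test.XXXXXX") || return 2

    head -c "$((size_mb * 1024 * 1024))" /dev/urandom > "$dir/test"
    ref=$(md5sum < "$dir/test" | cut -d' ' -f1)

    bad=0
    i=0
    while [ "$i" -lt "$copies" ]; do
        cp "$dir/test" "$dir/test$i"
        sum=$(md5sum < "$dir/test$i" | cut -d' ' -f1)
        if [ "$sum" != "$ref" ]; then
            echo "MISMATCH test$i: $sum (expected $ref)"
            # cmp -l prints one line per differing byte; count them
            echo "  differing bytes: $(cmp -l "$dir/test" "$dir/test$i" | wc -l)"
            bad=$((bad + 1))
        fi
        i=$((i + 1))
    done

    echo "$bad of $copies copies differ"
    rm -rf "$dir"
    [ "$bad" -eq 0 ]
}
```

Calling check_copies 300 10 /mnt/data (the path is an example) mirrors the sizes used above. The function returns non-zero if any copy differs, and the byte count from cmp gives a rough idea of how much of each bad copy is damaged.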

===================================================================
I did some tests:
* badblocks on the two disks, with both ro and rw tests => no errors reported
* memtest for 6 hours => no errors reported

* I reproduced the error
- under a Xen client host (where the issue first appeared)
- under the Xen hypervisor
- under a basic kernel with RAID mirroring + ext3 and reiserfs
- under a basic kernel w/o RAID, but with ext3 and reiserfs

My configuration:
* Asus P5B-VM
* 4 GB RAM [tried with and w/o the memory remapping option]
* Intel Core 2 Duo [normal speed and underclocked (233 bus speed)]
* SATA HD, WD 80 GB

Corruption with which controller? pata_jmicron? ata_piix? ahci?

Can you reproduce with 2.6.23-rc2? If not, please report the bug to OpenSuSE, since we only support unmodified vanilla kernels here.

Jeff


