Data corruption using 2940U

Carsten Gross (carsten@wohnheim.uni-ulm.de)
Sun, 2 Aug 1998 22:32:14 +0200


Hello!

Perhaps this question is not really kernel related, but it's a low-level
problem and it _could_ be a serious kernel problem.

I have some problems with data corruption on my SCSI discs connected to an
Adaptec 2940U. The mainboard is a dual Pentium Tyan Tomcat II (HX chipset,
2*P100, 64 MB RAM; kernel compiles run without problems), running SMP kernel
2.1.111.

It happens while copying from a Fujitsu MO drive (model M2512A, 216 MB)
to a Micropolis SCSI hard drive (model 4343NS) using the Adaptec driver
version 5.1.0pre4/3.2.4. Ultra support for the Micropolis is enabled, and
parity checking on the PCI bus is enabled with the LILO command line option
"aic7xxx=pci_parity". SCSI parity checking is also enabled in the BIOS
setup. The corruption is not just one or two bits, but a whole disc block of
corrupted data.

First of all I copied a large (100MB) software package stored on the MO with
cp to the hard drive. I did the copy a second time to a different directory
on the same hard drive because of problems with the first copy:

I would like to show the corruption pattern right here:
python:/scratch/carsten/win98>ll win98_29.cab
-rw-r--r-- 1 carsten muser 1716224 Aug 2 21:43 win98_29.cab
python:/scratch/carsten/win98>cmp -l win98_29.cab ../win98-2/win98_29.cab |wc -l
508 [Bytes are different, should be the same, copied with cp]
python:/scratch/carsten/win98>cmp win98_29.cab ../win98-2/win98_29.cab
win98_29.cab ../win98-2/win98_29.cab differ: char 697346, line 2757
python:/scratch/carsten/win98>hexdump -s 697346 -n 64 win98_29.cab
00aa402 0ba6 f32d 1a0e 2677 58fb 1d54 00a6 ce04
00aa412 a765 3028 96ae 117c f644 6ac1 4031 57a1
00aa422 9a84 d9f1 355f 45f7 4e3d 287e 9674 34e9
00aa432 d5b5 44de b8ab 5102 8656 08a6 1450 a781 [...]
python:/scratch/carsten/win98>hexdump -s 697346 -n 64 ../win98-2/win98_29.cab
00aa402 dc81 0d4f 992e 412c 61d0 4dc4 9bef a94e
00aa412 b6fa 86b1 efac 97cb a1ce 4a03 4384 d02e
00aa422 2fcb c0c8 49b3 45b9 eb7e 9477 f111 3222
00aa432 0d5e af47 02a7 84eb b826 a431 afb9 48c5 [...]
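
The cmp/hexdump session above can also be summarized per disc block. The
following is only a minimal sketch, assuming a 512-byte block size; the
function names diff_blocks and cmp_offset_to_block are made up for this
illustration and are not part of any existing tool.

```python
BLOCK_SIZE = 512  # assumption: block size of the drive, adjust if needed


def cmp_offset_to_block(cmp_offset, block_size=BLOCK_SIZE):
    """Map a byte number as printed by cmp (cmp counts from 1) to
    (block index, start offset of that block, position inside the block)."""
    zero_based = cmp_offset - 1
    block = zero_based // block_size
    return block, block * block_size, zero_based % block_size


def diff_blocks(path_a, path_b, block_size=BLOCK_SIZE):
    """Return {block index: number of differing bytes} for two files.

    Only the common length of both files is compared; a difference in
    file size is not reported by this sketch.
    """
    result = {}
    index = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a or not b:
                break
            differing = sum(1 for x, y in zip(a, b) if x != y)
            if differing:
                result[index] = differing
            index += 1
    return result
```

cmp_offset_to_block(697346) gives (1362, 697344, 1), so the first difference
reported by cmp lies one byte past the block boundary at 697344. A
diff_blocks() result with a single key would mean that all differing bytes
fall inside one aligned block. Assuming the file contents are close to random
(compressed .cab data), two unrelated blocks agree in about 1 byte out of 256
by chance, so roughly 2 matching bytes per 512 would be expected, which is in
line with 508 differing bytes.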

I'll keep the 'original' and the corrupted file for further examination.
Perhaps a hardware problem during reading/writing? This one happened during
writing, because the file is reproducibly defective, but I also see some
'sporadic' (read) errors. But how to check for this? The size of the
corrupted region is about one disc block (508 vs. 512 bytes) and it begins
really close to a 512-byte-aligned file offset (this would be: 697344). But
in my opinion every simple hard drive has a CRC-32 checksum, so a low-level
disc read problem shouldn't be the reason for this (there are no SCSI medium
errors in my logs).
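
One possible way to tell sporadic read errors from data that is wrong on the
disc is to read the same file several times and compare checksums. This is
only a sketch under assumptions: the function names are hypothetical, and
repeated reads are normally served from the kernel's buffer cache, so the
passes only exercise the drive if the cache is bypassed in between (for
example by unmounting and remounting the filesystem, or by testing data
larger than RAM).

```python
import hashlib


def file_digest(path, chunk_size=64 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def repeated_read_digests(path, passes=3):
    """Read the file `passes` times and return one digest per pass.

    Caveat: without flushing the buffer cache between passes, later
    passes may never touch the disc at all.
    """
    return [file_digest(path) for _ in range(passes)]


def reads_are_stable(path, passes=3):
    """True if every pass produced the same digest."""
    return len(set(repeated_read_digests(path, passes))) == 1
```

If the digests of the same file differ between passes, that would point to a
problem on the read path. If they are stable but differ from the digest of
the source file on the MO, the wrong data was most likely written to the disc.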

Thanks for your help

Carsten

-- 
Linux, WinNT and MS-DOS. The Good, The Bad and The Ugly
Carsten Gross                     carsten@sol.wohnheim.uni-ulm.de
Wohnheim Heilmeyersteige        Sebastian Kneipp Weg 6, 89075 Ulm
