I have investigated this problem and found that read errors occur on a
386 starting with the 1.3.94 patch and in all subsequent kernels through 2.0.23.
I say that only read errors occur because files I wrote that seemed bad
before were actually good when read with kernel 1.3.93 or earlier.
(They were written via NFS over my network from the 586.)
---------- Forwarded message ----------
Date: Sun, 13 Oct 1996 20:59:22 -0400 (EDT)
From: Don Parsons <dparsons@synapse.kent.edu>
To: linuxkernel <linux-kernel@vger.rutgers.edu>
Subject: 386 R/W filesys errors (but not msdos_fs)
My old PC has run Linux for 4 years, but since about May 1996 (first
noticed when trying to install 312E XFree86) I get bytes swapped at a few
places whose offsets are divisible by 4K, or all byte pairs swapped for a
12K block, in larger files of roughly half a MB to one MB and up. For
instance, a hexdump diff of the 312GS3.tgz file:
--- S3 Thu Oct 10 21:18:52 1996
+++ S3-err Thu Oct 10 21:19:46 1996
@@ -63741,775 +63741,775 @@
00f8fc0 c8aa 3308 40b1 330f c521 11dc 338e 8951
00f8fd0 bcee 1334 84ce 9dde 4baf e5f3 f87c 1312
00f8fe0 b5b1 a4d0 1e40 9c13 94c7 63c2 7d22 b20a
-00f8ff0 0169 6e38 f1a6 48c5 f666 1945 ca57 a38e <= this a3 lost
-00f9000 410e 602b 587c edd0 477f e0e5 a2a5 4ac0
-00f9010 1cb9 6dfc 5a0a 9615 9783 011d 6d21 f350
.
.
.
-00fbfe0 30f3 80e7 54b0 08aa 58e1 4fa7 58d5 6fbc
-00fbff0 d616 6b78 0ef8 6f9c 9d40 42b7 a0e6 8b49
+00f8ff0 0169 6e38 f1a6 48c5 f666 1945 ca57 8e8e <= note 8e dupped on a3
+00f9000 0ea3 2b41 7c60 d058 7fed e547 a5e0 c0a2
+00f9010 b94a fc1c 0a6d 155a 8396 1d97 2101 506d
.
.
.
+00fbfd0 ce9c 5e65 8309 4f1d 0612 d939 031f 4f38
+00fbfe0 f3a7 e730 b080 aa54 e108 a758 d54f bc58
+00fbff0 166f 78d6 f86b 9c0e 406f b79d e642 49a0
00fc000 0b9d 72f9 0680 5c1c 2dad 9fc2 a4b3 f347
00fc010 fbe8 001c 2449 f2cf 7865 5be6 8c36 e243
00fc020 d1d9 bf50 d255 2522 b3d2 9c72 7f3b a3ee
@@ -78333,775 +78333,775 @@
0131fc0 1bb1 bb48 b064 5d9a 08b4 3c4d ba91 7d3a
0131fd0 92cb 85b1 e51c ed44 6822 2e32 e637 aa9f
0131fe0 c32e 3c0c caac db23 7f93 e628 95e0 8473
-0131ff0 445c dbd3 69c7 2fbb 7991 ebaa 28b1 0f9d
-0132000 323f 4199 b216 0856 0243 6660 ad0f e9a2
-0132010 949f e841 59a8 7d1f d242 1e62 f5dc 9e2c
.
.
.
-0134fd0 2190 7416 b235 21a9 5d6b aae7 9d66 0e59
-0134fe0 4c47 c266 b6d1 9382 d9c4 f9be d5c8 4ae2
-0134ff0 e3fb 1325 65a1 f8d5 c097 f2b1 0651 4b8f
+0131ff0 445c dbd3 69c7 2fbb 7991 ebaa 28b1 9d9d
+0132000 3f0f 9932 1641 56b2 4308 6002 0f66 a2ad
+0132010 9fe9 4194 a8e8 1f59 427d 62d2 dc1e 2cf5
.
.
.
+0134fd0 90e6 1621 3574 a9b2 6b21 e75d 66aa 599d
+0134fe0 470e 664c d1c2 82b6 c493 bed9 c8f9 e2d5
+0134ff0 fb4a 25e3 a113 d565 97f8 b1c0 51f2 8f06
0135000 55c3 6237 375d 45e7 38d9 81f6 5a3a 93aa
0135010 174c fd16 c3cb b3ed 102a bad4 1aa2 a380
0135020 575c 0e84 5249 d94e 70e7 5b19 2f26 ddb3
.
.
.
....no more errs to end of file (312GS3.tgz)...
0137d20 fdfc f7f3 dfcf 2fff ff7f e71f 61e6 00e6
0137d30 2f30 0000
0137d33
-----------------------
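A minimal sketch of how the damaged regions in a hexdump diff like the one
above could be located automatically, given a known-good copy and a bad copy
of the same file. The names diff_runs, describe and compare_files are made up
for illustration, and the 16-byte merge gap is an arbitrary assumption. The
script only reports where the two copies differ and how each range sits
relative to a 4K boundary; it does not try to explain the cause.

```python
PAGE = 4096  # 4K, the boundary size mentioned in the report


def diff_runs(good, bad, gap=16):
    """Return (start, end) byte ranges, end exclusive, where the buffers differ.

    Mismatches separated by at most `gap` matching bytes are merged into one
    range, since a damaged region can contain bytes that match by chance.
    """
    runs = []
    for i in range(min(len(good), len(bad))):
        if good[i] != bad[i]:
            if runs and i - runs[-1][1] <= gap:
                runs[-1][1] = i + 1      # extend the current range
            else:
                runs.append([i, i + 1])  # start a new range
    return [(s, e) for s, e in runs]


def describe(runs):
    """Format each range with its length and its start offset modulo 4K."""
    return [f"{s:#x}-{e:#x} len={e - s} start%4K={s % PAGE}" for s, e in runs]


def compare_files(good_path, bad_path):
    """Read both files fully and return the formatted list of damaged ranges."""
    with open(good_path, "rb") as g, open(bad_path, "rb") as b:
        return describe(diff_runs(g.read(), b.read()))
```

For example, compare_files("S3", "S3-err") would return one line per damaged
range, assuming both files exist under those names.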
I had thought it was a hardware problem, but I just discovered that I can
use the msdos fs totally error free. I copied the entire 312G
distribution to /dos/d/sf312/ twice and gzip -tv *z gave no errors at
all. But almost all files over 750K on ext2 were bad at 4K
boundaries for a 12K block.
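The same kind of check as gzip -tv can be scripted so that a whole directory
is tested and the bad files are listed. This is only a sketch using Python's
standard gzip module; the function names gzip_ok and bad_archives are
hypothetical, and the "*z" pattern simply mirrors the shell glob used above.

```python
import glob
import gzip
import os
import zlib


def gzip_ok(path, chunk=1 << 16):
    """Return True if the gzip file decompresses cleanly (data and CRC check)."""
    try:
        with gzip.open(path, "rb") as f:
            while f.read(chunk):   # read to the end so the trailer is verified
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers a bad header or CRC mismatch, EOFError a truncated
        # file, zlib.error a corrupt compressed stream.
        return False


def bad_archives(directory, pattern="*z"):
    """Return the sorted names of files matching `pattern` that fail the test."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    return [os.path.basename(p) for p in paths if not gzip_ok(p)]
```

Calling bad_archives("/dos/d/sf312") would then give the list of archives in
that directory that fail the integrity test.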
They are also bad on the minix fs. There I once noticed that a copy of vmlinux
had only two errors, at 54*4K and 71*4K, each a one-byte error
without a 12K error block following. (The following byte had clobbered the
byte preceding it.)
I have had no errors on my 586 systems.
The 386 has an ESDI disk with the Linux swap partition (I turned swap off
and the errors continued). I have a 1.23GB disk with the ext2, minix, and
msdos file systems. It has an Ultrastor 14F SCSI controller. Video ET4000,
8MB RAM, Cyrix 387-33, Intel 386DX-33, SMC Ultra, and always the latest
kernel (2.0.22). No kerneld or sound. The system was SLS 0.98p1, upgraded to
ELF a year ago. (I tried both Ultrastor drivers in 2.0.23 and they gave the
same error problems.)
Does anybody use a 386 any more? Is there an error in the kernel with 386
PCs? (I am now, 10-27-96, pretty sure there is a kernel bug introduced
with patch 1.3.94 that causes 386 problems. It also seems that very few
people use 386s as I got no comments on the problem from my first
message).
I just did this: gzip -tv 312gfnon.tgz
and got many of these:
08:04: rw=0, want=5875201, limit=385056
attempt to access beyond end of device
08:04: rw=0, want=5875457, limit=385056
attempt to access beyond end of device
08:04: rw=0, want=5875713, limit=385056
attempt to access beyond end of device
because of the corrupt file.
Since I didn't use the 386 as much as the 586 I can only give a rough
estimate as to when the error(?) was introduced: 1.3.80--pre2.0.10 [I now
know it is the 1.3.94 patch].
Ideas? I'm quite willing to help track the bug.
-- Don Parsons