fs corruption w/ 2.0.33, AHA 2940

Oliver Mai (oliver.mai@hamburg.netsurf.de)
Sat, 21 Feb 1998 19:25:06 +0100


Hi!

I've been using Linux since kernel v. 0.99pl15, and this is
the first time I've had trouble. I'm not sure whether
my problems are HW or SW related, but maybe the experts
have an idea.
Setup: plain 2.0.33 UP kernel (i586), 32MB, Adaptec 2940 UW,
2 SCSI hard disks (the outer one actively terminated),
1 SCSI CD-ROM, 1 IDE disk not used by Linux, SB16, ISDN card,
53c810 SCSI controller only for scanner (module not in kernel
at that time)

After the computer had run unattended under
relatively high load (system load about 2) for some hours,
the filesystem became completely inconsistent: ls only showed
nonsense in any directory and several processes (e.g. xdm)
crashed when they were about to be paged in.
After rebooting, running e2fsck (from e2fsprogs 1.07) manually
repaired most of the files and directories.
I found the following messages in /var/log/messages:

Feb 15 14:46:40 lasse kernel: attempt to access beyond end of device
Feb 15 14:46:40 lasse kernel: 08:01: rw=0, want=1920171312, limit=971901
Feb 15 14:46:43 lasse kernel: attempt to access beyond end of device
Feb 15 14:46:43 lasse kernel: 08:01: rw=0, want=1920171312, limit=971901
Feb 15 14:47:28 lasse kernel: attempt to access beyond end of device
...
Exactly the same message was repeated for about 3 hours. Then
the other filesystems started having problems, too:
...
Feb 15 17:40:44 lasse kernel: 08:01: rw=0, want=1920171312, limit=971901
Feb 15 20:37:42 lasse kernel: attempt to access beyond end of device
Feb 15 20:37:42 lasse kernel: 08:12: rw=0, want=1852142698, limit=859477
Feb 15 20:37:42 lasse kernel: attempt to access beyond end of device
Feb 15 20:37:42 lasse kernel: 08:12: rw=0, want=170999819, limit=859477
Feb 15 20:37:42 lasse kernel: attempt to access beyond end of device
Feb 15 20:37:42 lasse kernel: 08:12: rw=0, want=571093825, limit=859477
...
Feb 15 20:41:39 lasse kernel: 08:14: rw=0, want=673789234, limit=1807312
Feb 15 20:41:39 lasse kernel: attempt to access beyond end of device
Feb 15 20:41:39 lasse kernel: 08:14: rw=0, want=824981046, limit=1807312
Feb 15 20:41:39 lasse kernel: attempt to access beyond end of device...

Probably these new messages were triggered by user activity. I rebooted
shortly thereafter.
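
For reference, the device field in these lines can be decoded mechanically. The following is only a minimal sketch: it assumes that the 2.0 kernel prints the device as major:minor in hex, that major 8 is the SCSI disk driver with 16 minors per disk, and that want/limit are counted in 1 kB blocks. None of this is stated in the log itself, and decode() is a made-up helper name.

```python
import re

# Matches the second line of each message pair, e.g.
# "08:12: rw=0, want=1852142698, limit=859477"
LINE_RE = re.compile(
    r"([0-9a-f]{2}):([0-9a-f]{2}): rw=(\d+), want=(\d+), limit=(\d+)"
)

def decode(line):
    """Decode one 'rw=..., want=..., limit=...' kernel log line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    major = int(m.group(1), 16)   # assumption: printed in hex
    minor = int(m.group(2), 16)
    rw, want, limit = (int(m.group(i)) for i in (3, 4, 5))
    device = None
    if major == 8:                # assumption: SCSI disk major
        disk, part = divmod(minor, 16)
        device = "sd" + chr(ord("a") + disk) + (str(part) if part else "")
    return {
        "device": device,
        "op": "read" if rw == 0 else "write",
        "want": want,
        "limit": limit,
        "ratio": want / limit,    # how far past the end the request is
    }

for line in (
    "Feb 15 14:46:40 lasse kernel: 08:01: rw=0, want=1920171312, limit=971901",
    "Feb 15 20:37:42 lasse kernel: 08:12: rw=0, want=1852142698, limit=859477",
    "Feb 15 20:41:39 lasse kernel: 08:14: rw=0, want=673789234, limit=1807312",
):
    print(decode(line))
```

Under these assumptions the three device codes correspond to sda1, sdb2 and sdb4, and every requested block number is hundreds of times larger than the partition.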

Can anyone explain what these messages mean? Is it possible to tell
whether they were most probably caused by bad termination/cabling or
by SW problems?
I've been using the same host adapter for more than a year and the same
HD's since last summer, w/o any probs. I've now checked my HW
by running four concurrent badblocks(8) (as recommended by Doug Ledford)
over a 1.7 GB partition, but no errors were detected.
Should I install a later version of the aic7xxx driver?

Even now after running e2fsck (both 1.07 and 1.10) there is still
(at least?) one completely inconsistent directory:

# ls -la /opt/applix/axart/borders
br-s---r-- 1 28787 10536 114, 104 Mar 23 2029 bordr003.ag
br-sr-srwx 1 12133 25647 114, 112 Feb 23 1996 bordr004.ag
b--sr-s--t 1 8295 29545 114, 104 May 26 2023 bordr005.ag
c--SrwS--t 1 11827 8250 108, 108 Nov 14 2022 bordr006.ag
br-xr--r-x 1 20580 29299 116, 115 Jan 18 2026 bordr007.ag

I cannot even remove these files:
# rm -f bordr003.ag
rm: bordr003.ag: Operation not permitted
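
One detail of the listing above that may or may not be significant: the bogus major/minor device numbers all lie in the range of lowercase ASCII letters. A minimal sketch that just prints them as characters (purely illustrative; it says nothing definite about the cause):

```python
# Major/minor pairs copied from the ls output above.
pairs = [(114, 104), (114, 112), (114, 104), (108, 108), (116, 115)]

# Interpret each number as an ASCII character code.
decoded = ["".join(chr(n) for n in pair) for pair in pairs]
print(decoded)

# Check whether every value is a lowercase ASCII letter.
all_lowercase = all(ord("a") <= n <= ord("z") for pair in pairs for n in pair)
print(all_lowercase)
```

If that is not a coincidence, it would suggest that ordinary text data ended up where inode data should be, but that is only a guess.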

Any hints?

When I scanned through older system logs I found the following messages
from 2 days before the "crash":
Feb 13 20:00:54 lasse kernel: attempt to access beyond end of device
Feb 13 20:00:54 lasse kernel: 08:14: rw=0, want=137271337, limit=1807312
Feb 13 20:00:54 lasse kernel: attempt to access beyond end of device
Feb 13 20:00:54 lasse kernel: 08:14: rw=0, want=1075567977, limit=1807312
Feb 13 20:00:54 lasse kernel: attempt to access beyond end of device
Feb 13 20:00:54 lasse kernel: 08:14: rw=1, want=137271337, limit=1807312
After that a large number of \0 characters was written to the log file.
Otherwise no such messages had been logged for at least a year.
I've been running 2.0.33 since December; before that I was running
2.0.29 for a long time.

I'm thankful for any hints. I hope this is not too far off-topic.
Attached you find my machine's boot output with aic7xxx=verbose.

Cheers,
Oliver Mai

-- 
___________________________________
Oliver Mai (Oliver.Mai@Hamburg.netsurf.de, omi@gauss.de)
For information about the moxfm file manager have a look at
http://sugra.desy.de/user/mai/moxfm

Boot output:

Feb 20 09:23:34 lasse kernel: Console: 16 point font, 400 scans
Feb 20 09:23:34 lasse kernel: Console: colour VGA+ 80x25, 1 virtual console (max 63)
Feb 20 09:23:34 lasse kernel: pcibios_init : BIOS32 Service Directory structure at 0x000fc4e0
Feb 20 09:23:34 lasse kernel: pcibios_init : BIOS32 Service Directory entry at 0xfc8c0
Feb 20 09:23:34 lasse kernel: pcibios_init : PCI BIOS revision 2.00 entry at 0xfc8f0
Feb 20 09:23:34 lasse kernel: Probing PCI hardware.
Feb 20 09:23:34 lasse kernel: Calibrating delay loop.. ok - 35.84 BogoMIPS
Feb 20 09:23:34 lasse kernel: Memory: 31028k/32768k available (632k kernel code, 384k reserved, 724k data)
Feb 20 09:23:34 lasse kernel: Swansea University Computer Society TCP/IP for NET3.034
Feb 20 09:23:34 lasse kernel: IP Protocols: ICMP, UDP, TCP
Feb 20 09:23:34 lasse kernel: Checking 386/387 coupling... Ok, fpu using exception 16 error reporting.
Feb 20 09:23:34 lasse kernel: Checking 'hlt' instruction... Ok.
Feb 20 09:23:34 lasse kernel: alias mapping IDT readonly ... ... done
Feb 20 09:23:34 lasse kernel: Linux version 2.0.33 (root@lasse) (gcc version 2.7.2) #2 Sun Dec 21 12:37:24 MET 1997
Feb 20 09:23:34 lasse kernel: Starting kswapd v 1.4.2.2
Feb 20 09:23:34 lasse kernel: ide0: buggy cmd640b interface on PCI (type1), config=0x5e
Feb 20 09:23:34 lasse kernel: ide1: not serialized, secondary interface not responding
Feb 20 09:23:34 lasse kernel: cmd640: drive0 timings/prefetch(on) preserved
Feb 20 09:23:34 lasse kernel: cmd640: drive1 timings/prefetch(on) preserved
Feb 20 09:23:34 lasse kernel: hda: Conner Peripherals 540MB - CFA540A, 516MB w/256kB Cache, CHS=1048/16/63
Feb 20 09:23:34 lasse kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Feb 20 09:23:34 lasse kernel: aic7xxx: <Adaptec AHA-294X Ultra SCSI host adapter> at PCI 11
Feb 20 09:23:34 lasse kernel: scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 4.1.1/3.2.1
Feb 20 09:23:34 lasse kernel: scsi : 1 host.
Feb 20 09:23:34 lasse kernel: (scsi0:0:0:0) Refusing WIDE negotiation; using 8 bit transfers.
Feb 20 09:23:34 lasse kernel: (scsi0:0:0:0) Synchronous at 10.0MHz, offset 15.
Feb 20 09:23:34 lasse kernel: Vendor: IBM Model: DPES-31080 Rev: S31Q
Feb 20 09:23:34 lasse kernel: Type: Direct-Access ANSI SCSI revision: 02
Feb 20 09:23:34 lasse kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
Feb 20 09:23:34 lasse kernel: (scsi0:0:1:0) Refusing WIDE negotiation; using 8 bit transfers.
Feb 20 09:23:34 lasse kernel: (scsi0:0:1:0) Synchronous at 20.0MHz, offset 15.
Feb 20 09:23:34 lasse kernel: (scsi0:0:1:0) Sending reply SDTR.
Feb 20 09:23:34 lasse kernel: Vendor: IBM Model: DCAS-34330 Rev: S61A
Feb 20 09:23:34 lasse kernel: Type: Direct-Access ANSI SCSI revision: 02
Feb 20 09:23:34 lasse kernel: Detected scsi disk sdb at scsi0, channel 0, id 1, lun 0
Feb 20 09:23:34 lasse kernel: (scsi0:0:6:0) Refusing WIDE negotiation; using 8 bit transfers.
Feb 20 09:23:34 lasse kernel: (scsi0:0:6:0) Synchronous at 10.0MHz, offset 8.
Feb 20 09:23:34 lasse kernel: Vendor: TOSHIBA Model: CD-ROM XM-5701TA Rev: 0167
Feb 20 09:23:34 lasse kernel: Type: CD-ROM ANSI SCSI revision: 02
Feb 20 09:23:34 lasse kernel: Detected scsi CD-ROM sr0 at scsi0, channel 0, id 6, lun 0
Feb 20 09:23:34 lasse kernel: scsi : detected 1 SCSI cdrom 2 SCSI disks total.
Feb 20 09:23:34 lasse kernel: SCSI device sda: hdwr sector= 512 bytes. Sectors= 2118144 [1034 MB] [1.0 GB]
Feb 20 09:23:34 lasse kernel: SCSI device sdb: hdwr sector= 512 bytes. Sectors= 8467200 [4134 MB] [4.1 GB]
Feb 20 09:23:34 lasse kernel: Partition check:
Feb 20 09:23:34 lasse kernel: sda: sda1 sda2
Feb 20 09:23:34 lasse kernel: sdb: sdb1 sdb2 sdb3 sdb4
Feb 20 09:23:34 lasse kernel: hda: hda1
Feb 20 09:23:34 lasse kernel: VFS: Mounted root (ext2 filesystem) readonly.
Feb 20 09:23:34 lasse kernel: Adding Swap: 80320k swap-space (priority -1)
