Various errors including TCP bug...

Randy Jay Yarger (randy@hs1.hst.msu.edu)
Fri, 18 Apr 1997 00:14:55 -0400 (EDT)


Howdy, my system is a PPro 200 with 64MB of RAM and an Adaptec SCSI card
(full boot messages below). I've been having odd problems with the system
for a couple of months now, leading up to a complete hard disk corruption
last week (which I would have avoided if I had the newest e2fsck. Get it!)
Anyway, the weirdness hasn't stopped, and I want to catch it before my
new root disk gets zapped (and I'm just plain tired of going to reboot it
several times a day).

From looking at newsgroups, FAQs, etc., it seems that the RAM and/or my
Adaptec SCSI card may be the problem, or maybe something else that I'm
missing completely. I ran the chipmunk RAM test on 16MB of my RAM the other
day, but since the machine is a high-volume Web and e-mail server I wasn't
able to shut it down to test all of the RAM.

Here are the interesting syslog messages I've observed in the past few days:

Apr 13 17:30:35 h-net2 kernel: free_one_pmd: bad directory entry 00000100

(a lot of 'NFS host not responding' in between these despite the fact that
the NFS server is right next to it and I'm telnetted to it at all times).

Apr 14 11:41:32 h-net2 kernel: free_one_pmd: bad directory entry 00000100
Apr 15 17:01:33 h-net2 kernel: free_one_pmd: bad directory entry 00000100
Apr 15 18:10:00 h-net2 kernel: free_one_pmd: bad directory entry 00000100

Apr 16 14:49:42 h-net2 kernel: kfree of non-kmalloced memory: 00b70810, next= 00000000, order=261
Apr 16 14:49:42 h-net2 kernel: kfree of non-kmalloced memory: 00b70414, next= 00000000, order=261
(A crash happened here)

Apr 16 17:47:45 h-net2 kernel: free_one_pmd: bad directory entry 00000100

Apr 17 12:32:08 h-net2 sendmail[24829]: collect: premature EOM: Error 0
Apr 17 12:33:02 h-net2 kernel: Warning: dev (03:07) tty->count(2) != #fd's(1) in do_tty_hangup
(another crash here)

Apr 17 17:54:08 h-net2 kernel: Lost timer or fin packet in tcp_fin.

Apr 17 21:34:45 h-net2 kernel: TCP: **bug**: copy=0, sk->mss=0
(This is the latest...)
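
To see at a glance which of the kernel messages above recur and which are
one-offs, a small script along these lines might help. This is only a rough
sketch: the function name summarize_kernel_messages and the idea of replacing
every number or hex value with "N" are illustrative assumptions, not part of
any existing tool, and the regular expression only covers classic syslog
lines of the form shown in this post.

```python
import re
from collections import Counter

# Matches classic syslog kernel lines such as:
#   Apr 13 17:30:35 h-net2 kernel: free_one_pmd: bad directory entry 00000100
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+kernel:\s+(?P<msg>.*)$"
)

# Any word-like token made of hex characters that contains at least one digit
# (addresses, counters, flags) is treated as a variable part of the message.
NUMBER_RE = re.compile(r"\b[0-9a-fA-F]*\d[0-9a-fA-F]*\b")


def summarize_kernel_messages(lines):
    """Count kernel messages, grouping lines that differ only in numbers."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a kernel line (e.g. sendmail), skip it
        key = NUMBER_RE.sub("N", match.group("msg"))
        counts[key] += 1
    return counts


# Small demonstration using a few lines copied from the log above.
sample = [
    "Apr 13 17:30:35 h-net2 kernel: free_one_pmd: bad directory entry 00000100",
    "Apr 14 11:41:32 h-net2 kernel: free_one_pmd: bad directory entry 00000100",
    "Apr 16 14:49:42 h-net2 kernel: kfree of non-kmalloced memory: 00b70810, next= 00000000, order=261",
    "Apr 17 12:32:08 h-net2 sendmail[24829]: collect: premature EOM: Error 0",
]
for message, count in summarize_kernel_messages(sample).most_common():
    print(count, message)
```

Feeding the whole syslog file through the same function (for example by
passing an open file object as "lines") would show whether free_one_pmd
really dominates or whether other messages are hiding in the noise.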

Here is my latest set of boot messages:

Apr 17 14:03:07 h-net2 kernel: Console: 16 point font, 400 scans
Apr 17 14:03:07 h-net2 kernel: Console: colour VGA+ 80x25, 1 virtual console (max 63)
Apr 17 14:03:07 h-net2 kernel: pcibios_init : BIOS32 Service Directory structure at 0x000fccf0
Apr 17 14:03:07 h-net2 kernel: pcibios_init : BIOS32 Service Directory entry at 0xfcd00
Apr 17 14:03:07 h-net2 kernel: pcibios_init : PCI BIOS revision 2.10 entry at 0xfcd21
Apr 17 14:03:07 h-net2 kernel: Probing PCI hardware.
Apr 17 14:03:07 h-net2 kernel: Calibrating delay loop.. ok - 199.07 BogoMIPS
Apr 17 14:03:07 h-net2 kernel: Memory: 63240k/65536k available (712k kernel code, 384k reserved, 1200k data)
Apr 17 14:03:07 h-net2 kernel: Swansea University Computer Society TCP/IP for NET3.034
Apr 17 14:03:07 h-net2 kernel: IP Protocols: ICMP, UDP, TCP
Apr 17 14:03:07 h-net2 kernel: Checking 386/387 coupling... Ok, fpu using exception 16 error reporting.
Apr 17 14:03:07 h-net2 kernel: Checking 'hlt' instruction... Ok.
Apr 17 14:03:07 h-net2 kernel: Linux version 2.0.27 (root@hs6) (gcc version 2.7.2) #5 Thu Apr 10 14:46:54 EDT 1997
Apr 17 14:03:07 h-net2 kernel: hda: WDC AC32500H, 2441MB w/128kB Cache, LBA, CHS=620/128/63
Apr 17 14:03:07 h-net2 kernel: hdb: WDC AC32100H, 2014MB w/128kB Cache, LBA, CHS=1023/64/63
Apr 17 14:03:07 h-net2 kernel: hdc: ST32140A, 2015MB w/128kB Cache, LBA, CHS=4095/16/63
Apr 17 14:03:07 h-net2 kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Apr 17 14:03:07 h-net2 kernel: ide1 at 0x170-0x177,0x376 on irq 15
Apr 17 14:03:07 h-net2 kernel: Started kswapd v 1.4.2.2
Apr 17 14:03:07 h-net2 kernel: AHA-2940 (PCI-bus), I/O 0xfc00, Mem 0xfff7f000:
Apr 17 14:03:07 h-net2 kernel: irq 10
Apr 17 14:03:07 h-net2 kernel: bus release time 40 bclks
Apr 17 14:03:07 h-net2 kernel: data fifo threshold 100
Apr 17 14:03:07 h-net2 kernel: SCSI CHANNEL A:
Apr 17 14:03:07 h-net2 kernel: scsi id 7
Apr 17 14:03:07 h-net2 kernel: scsi selection timeout 256 ms
Apr 17 14:03:07 h-net2 kernel: scsi bus reset at power-on enabled
Apr 17 14:03:07 h-net2 kernel: scsi bus parity enabled
Apr 17 14:03:07 h-net2 kernel: scsi bus termination (low byte) enabled
Apr 17 14:03:07 h-net2 kernel: aic7xxx: Resetting the SCSI bus...done.
Apr 17 14:03:07 h-net2 kernel: scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 4.0/3.2/4.0
Apr 17 14:03:07 h-net2 kernel: scsi : 1 host.
Apr 17 14:03:07 h-net2 kernel: scsi0: Target 1, channel A, now synchronous at 10.0MHz, offset 15.
Apr 17 14:03:07 h-net2 kernel: Vendor: SEAGATE Model: ST410800N Rev: 0025
Apr 17 14:03:07 h-net2 kernel: Type: Direct-Access ANSI SCSI revision: 02
Apr 17 14:03:07 h-net2 kernel: Detected scsi disk sda at scsi0, channel 0, id 1, lun 0
Apr 17 14:03:07 h-net2 kernel: scsi0: Target 3, channel A, now synchronous at 5.0MHz, offset 11.
Apr 17 14:03:07 h-net2 kernel: Vendor: EXABYTE Model: EXB-8505 Rev: 0046
Apr 17 14:03:07 h-net2 kernel: Type: Sequential-Access ANSI SCSI revision: 02
Apr 17 14:03:07 h-net2 kernel: scsi : detected 1 SCSI tape 1 SCSI disk total.
Apr 17 14:03:07 h-net2 kernel: SCSI device sda: hdwr sector= 512 bytes. Sectors= 17755614 [8669 MB] [8.7 GB]
Apr 17 14:03:07 h-net2 kernel: Overriding PCI latency timer (CFLT) setting of 64, new value is 255.
Apr 17 14:03:07 h-net2 kernel: eth0: 3Com 3c595 Vortex 100baseTX at 0xff80, 00:20:af:ef:51:56, IRQ 9
Apr 17 14:03:07 h-net2 kernel: Internal config register is 101001b, transceivers 0xe10a.
Apr 17 14:03:07 h-net2 kernel: 64K word-wide RAM 3:1 Rx:Tx split, autoselect/10baseT interface.
Apr 17 14:03:07 h-net2 kernel: 3c59x.c:v0.25 5/17/96 becker@cesdis.gsfc.nasa.gov
Apr 17 14:03:07 h-net2 kernel: Partition check:
Apr 17 14:03:07 h-net2 kernel: sda: sda1 sda2 sda3 sda4
Apr 17 14:03:07 h-net2 kernel: hda: hda1
Apr 17 14:03:07 h-net2 kernel: hdb: hdb1
Apr 17 14:03:07 h-net2 kernel: hdc: [PTBL] [1023/64/63] hdc1 hdc2
Apr 17 14:03:07 h-net2 kernel: VFS: Mounted root (ext2 filesystem) readonly.
Apr 17 14:03:07 h-net2 kernel: Adding Swap: 46364k swap-space
Apr 17 14:03:07 h-net2 kernel: EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
Apr 17 14:03:07 h-net2 kernel: eth0: Initial media type 100baseTX.
Apr 17 14:03:07 h-net2 kernel: eth0: vortex_open() InternalConfig 0141001b.
Apr 17 14:03:07 h-net2 kernel: eth0: vortex_open() irq 9 media status 8802.
Apr 17 14:03:07 h-net2 kernel: eth0: Media selection timer tick happened, 100baseTX.
Apr 17 14:03:07 h-net2 kernel: eth0: Media 100baseTX is has no link beat, 8082.
Apr 17 14:03:07 h-net2 kernel: eth0: Media selection failed, now trying 10baseT port.
Apr 17 14:03:07 h-net2 kernel: eth0: Media selection timer finished, 10baseT.
Apr 17 14:03:08 h-net2 kernel: eth0: Media selection timer tick happened, 10baseT.
Apr 17 14:03:08 h-net2 kernel: eth0: Media 10baseT has link beat, 88c0.
Apr 17 14:03:08 h-net2 kernel: eth0: Media selection timer finished, 10baseT.
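
As a quick sanity check on the "Memory:" line in the boot messages above,
the figures can be added up to confirm that the kernel accounts for the full
64MB it was told about. The sketch below is only an illustration: the
function name memory_accounting and the dictionary keys are made up for this
example, and the pattern assumes the exact wording of the line shown in this
boot log. Note that it only checks the kernel's bookkeeping and says nothing
about whether the RAM itself is good.

```python
import re

# Pattern for a boot line such as:
#   Memory: 63240k/65536k available (712k kernel code, 384k reserved, 1200k data)
MEM_RE = re.compile(
    r"Memory:\s*(\d+)k/(\d+)k available "
    r"\((\d+)k kernel code, (\d+)k reserved, (\d+)k data\)"
)


def memory_accounting(line):
    """Return the total, the sum of the itemised figures, and the difference."""
    match = MEM_RE.search(line)
    if match is None:
        raise ValueError("line does not look like a kernel Memory: message")
    available, total, code, reserved, data = map(int, match.groups())
    accounted = available + code + reserved + data
    return {
        "total_kb": total,
        "accounted_kb": accounted,
        "missing_kb": total - accounted,
    }


boot_line = (
    "Apr 17 14:03:07 h-net2 kernel: Memory: 63240k/65536k available "
    "(712k kernel code, 384k reserved, 1200k data)"
)
print(memory_accounting(boot_line))
```

For the line above, the available, kernel code, reserved and data figures
add up to exactly 65536k, so nothing is unaccounted for at boot time.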

Any help would be greatly appreciated!
Randy Jay Yarger | H-Net, Humanities OnLine
randy@yarger.tcimet.net | http://yarger.tcimet.net/