Re: Possible TCP Problem with RH6.2 talking to Solaris2.6/2.7 (f

From: shane (shane@bratnet.net)
Date: Thu May 04 2000 - 12:11:52 EST


Well, the main test I performed was without a switch in the way, which made
it easier to rule the switch out as the problem.

But here is some other information about the server:
May 3 14:59:32 newspeer2 kernel: klogd 1.3-3, log source = /proc/kmsg
started.
May 3 14:59:32 newspeer2 kernel: Inspecting /boot/System.map-2.2.14-5.0
May 3 14:59:32 newspeer2 kernel: Loaded 7364 symbols from
/boot/System.map-2.2.14-5.0.
May 3 14:59:32 newspeer2 kernel: Symbols match kernel version 2.2.14.
May 3 14:59:32 newspeer2 kernel: Loaded 32 symbols from 4 modules.
May 3 14:59:32 newspeer2 kernel: Linux version 2.2.14-5.0
(root@porky.devel.redhat.com) (gcc version egcs-2.91.66 19990314/Linux
(egcs-1.1.2 release)) #1 Tue Mar 7 21:07:39 EST 2000
May 3 14:59:32 newspeer2 kernel: relocating initrd image:
May 3 14:59:32 newspeer2 kernel: initrd_start:0xc0f99000
initrd_end:0xc0fff0f8
May 3 14:59:32 newspeer2 kernel: mem_start:0xc026c000
mem_end:0xd4000000
May 3 14:59:32 newspeer2 kernel: initrd_size:0x000660f8
dest:0xd3f99000
May 3 14:59:32 newspeer2 kernel: Detected 199459356 Hz processor.
May 3 14:59:32 newspeer2 kernel: Console: colour VGA+ 80x25
May 3 14:59:32 newspeer2 kernel: Calibrating delay loop... 199.07
BogoMIPS
May 3 14:59:32 newspeer2 kernel: Memory: 322152k/327680k available (1060k
kernel code, 416k reserved, 3576k data, 64k init, 0k bigmem)
May 3 14:59:32 newspeer2 kernel: Dentry hash table entries: 262144 (order
9, 2048k)
May 3 14:59:32 newspeer2 kernel: Buffer cache hash table entries: 524288
(order 9, 2048k)
May 3 14:59:32 newspeer2 kernel: Page cache hash table entries: 131072
(order 7, 512k)
May 3 14:59:32 newspeer2 kernel: VFS: Diskquotas version dquot_6.4.0
initialized
May 3 14:59:32 newspeer2 kernel: CPU: Intel Pentium Pro stepping 09
May 3 14:59:32 newspeer2 kernel: Checking 386/387 coupling... OK, FPU
using exception 16 error reporting.
May 3 14:59:32 newspeer2 kernel: Checking 'hlt' instruction... OK.
May 3 14:59:32 newspeer2 kernel: POSIX conformance testing by UNIFIX
May 3 14:59:32 newspeer2 kernel: mtrr: v1.35a (19990819) Richard Gooch
(rgooch@atnf.csiro.au)
May 3 14:59:32 newspeer2 kernel: PCI: PCI BIOS revision 2.10 entry at
0xf0068
May 3 14:59:32 newspeer2 kernel: PCI: Using configuration type 1
May 3 14:59:32 newspeer2 kernel: PCI: Probing PCI hardware
May 3 14:59:32 newspeer2 kernel: PCI: Device 00:00 not found by BIOS
May 3 14:59:32 newspeer2 kernel: PCI: Device 00:a0 not found by BIOS
May 3 14:59:32 newspeer2 kernel: PCI: 00:00 [8086/1237]: Passive release
enable (00)
May 3 14:59:32 newspeer2 kernel: Linux NET4.0 for Linux 2.2
May 3 14:59:32 newspeer2 kernel: Based upon Swansea University Computer
Society NET3.039
May 3 14:59:32 newspeer2 kernel: NET4: Unix domain sockets 1.0 for Linux
NET4.0.
May 3 14:59:32 newspeer2 kernel: NET4: Linux TCP/IP 1.0 for NET4.0
May 3 14:59:32 newspeer2 kernel: IP Protocols: ICMP, UDP, TCP, IGMP
May 3 14:59:32 newspeer2 kernel: TCP: Hash tables configured (ehash
524288 bhash 65536)
May 3 14:59:32 newspeer2 kernel: Initializing RT netlink socket
May 3 14:59:32 newspeer2 kernel: Starting kswapd v 1.5
May 3 14:59:32 newspeer2 kernel: Detected PS/2 Mouse Port.
May 3 14:59:32 newspeer2 kernel: Serial driver version 4.27 with
MANY_PORTS MULTIPORT SHARE_IRQ enabled
May 3 14:59:32 newspeer2 kernel: ttyS00 at 0x03f8 (irq = 4) is a 16550A
May 3 14:59:32 newspeer2 kernel: ttyS01 at 0x02f8 (irq = 3) is a 16550A
May 3 14:59:32 newspeer2 kernel: pty: 256 Unix98 ptys configured
May 3 14:59:32 newspeer2 kernel: apm: BIOS not found.
May 3 14:59:32 newspeer2 kernel: Real Time Clock Driver v1.09
May 3 14:59:32 newspeer2 kernel: RAM disk driver initialized: 16 RAM
disks of 4096K size
May 3 14:59:32 newspeer2 kernel: hda: MATSHITA CR-583, ATAPI CDROM drive
May 3 14:59:32 newspeer2 kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
May 3 14:59:32 newspeer2 kernel: hda: ATAPI 8X CD-ROM drive, 128kB Cache
May 3 14:59:33 newspeer2 kernel: Uniform CDROM driver Revision: 2.56
May 3 14:59:33 newspeer2 kernel: Floppy drive(s): fd0 is 1.44M
May 3 14:59:33 newspeer2 kernel: FDC 0 is a National Semiconductor
PC87306
May 3 14:59:33 newspeer2 kernel: md driver 0.90.0 MAX_MD_DEVS=256,
MAX_REAL=12
May 3 14:59:33 newspeer2 kernel: raid5: measuring checksumming speed
May 3 14:59:33 newspeer2 kernel: 8regs : 337.185 MB/sec
May 3 14:59:33 newspeer2 kernel: 32regs : 190.881 MB/sec
May 3 14:59:33 newspeer2 kernel: using fastest function: 8regs (337.185
MB/sec)
May 3 14:59:33 newspeer2 kernel: scsi : 0 hosts.
May 3 14:59:33 newspeer2 kernel: scsi : detected total.
May 3 14:59:33 newspeer2 kernel: md.c: sizeof(mdp_super_t) = 4096
May 3 14:59:33 newspeer2 kernel: Partition check:
May 3 14:59:33 newspeer2 kernel: RAMDISK: Compressed image found at block
0
May 3 14:59:33 newspeer2 kernel: autodetecting RAID arrays
May 3 14:59:33 newspeer2 kernel: autorun ...
May 3 14:59:33 newspeer2 kernel: ... autorun DONE.
May 3 14:59:33 newspeer2 kernel: VFS: Mounted root (ext2 filesystem).
May 3 14:59:34 newspeer2 kernel: ncr53c8xx: at PCI bus 1, device 9,
function 0
May 3 14:59:34 newspeer2 kernel: ncr53c8xx: 53c875 detected
May 3 14:59:34 newspeer2 kernel: ncr53c875-0: rev=0x03, base=0x40201000,
io_port=0x7000, irq=11
May 3 14:59:34 newspeer2 kernel: ncr53c875-0: ID 7, Fast-20, Parity
Checking
May 3 14:59:34 newspeer2 kernel: ncr53c875-0: on-chip RAM at 0x40200000
May 3 14:59:34 newspeer2 kernel: ncr53c875-0: restart (scsi reset).
May 3 14:59:34 newspeer2 kernel: ncr53c875-0: Downloading SCSI SCRIPTS.
May 3 14:59:34 newspeer2 kernel: scsi0 : ncr53c8xx - version 3.2a-2
May 3 14:59:34 newspeer2 kernel: scsi : 1 host.
May 3 14:59:34 newspeer2 kernel: Vendor: DEC Model: DLT2000 15/30
GB Rev: 840B
May 3 14:59:34 newspeer2 kernel: Type: Sequential-Access
ANSI SCSI revision: 02
May 3 14:59:34 newspeer2 kernel: Compaq SMART2 Driver (v 1.0.6)
May 3 14:59:34 newspeer2 kernel: Found 1 controller(s)
May 3 14:59:34 newspeer2 kernel: cpqarray: Finding drives on ida0
(SMART-2SL)
May 3 14:59:34 newspeer2 kernel: cpqarray ida/c0d0: blksz=512
nr_blks=17764320
May 3 14:59:34 newspeer2 kernel: cpqarray ida/c0d1: blksz=512
nr_blks=53309280
May 3 14:59:34 newspeer2 kernel: ida/c0d0: ida/c0d0p1 ida/c0d0p2 <
ida/c0d0p5 ida/c0d0p6 ida/c0d0p7 ida/c0d0p8 ida/c0d0p9 ida/c0d0p10
ida/c0d0p11 >
May 3 14:59:34 newspeer2 kernel: ida/c0d1: ida/c0d1p1
May 3 14:59:34 newspeer2 kernel: autodetecting RAID arrays
May 3 14:59:34 newspeer2 kernel: autorun ...
May 3 14:59:34 newspeer2 kernel: ... autorun DONE.
May 3 14:59:34 newspeer2 kernel: VFS: Mounted root (ext2 filesystem)
readonly.
May 3 14:59:34 newspeer2 kernel: change_root: old root has d_count=1
May 3 14:59:34 newspeer2 kernel: Trying to unmount old root ... okay
May 3 14:59:34 newspeer2 kernel: Freeing unused kernel memory: 64k freed
May 3 14:59:34 newspeer2 kernel: Adding Swap: 134616k swap-space
(priority -1)
May 3 14:59:34 newspeer2 kernel: st: bufsize 32768, wrt 30720, max
buffers 4, s/g segs 16.
May 3 14:59:34 newspeer2 kernel: Detected scsi tape st0 at scsi0, channel
0, id 6, lun 0
May 3 14:59:34 newspeer2 kernel: eepro100.c:v1.09j-t 9/29/99 Donald
Becker http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
May 3 14:59:34 newspeer2 kernel: eepro100.c: $Revision: 1.18 $ 1999/12/29
Modified by Andrey V. Savochkin <saw@msu.ru>
May 3 14:59:34 newspeer2 kernel: eth0: OEM i82557/i82558 10/100 Ethernet
at 0xd4841000, 00:60:B0:66:84:7F, IRQ 9.
May 3 14:59:34 newspeer2 kernel: Board assembly 673610-001, Physical
connectors present: RJ45
May 3 14:59:34 newspeer2 kernel: Primary interface chip i82555 PHY #1.
May 3 14:59:34 newspeer2 kernel: General self-test: passed.
May 3 14:59:34 newspeer2 kernel: Serial sub-system self-test: passed.
May 3 14:59:34 newspeer2 kernel: Internal registers self-test: passed.
May 3 14:59:34 newspeer2 kernel: ROM checksum self-test: passed
(0x49caa8d6).
May 3 14:59:34 newspeer2 kernel: Receiver lock-up workaround activated.
May 3 14:59:34 newspeer2 kernel: eth1: Intel PCI EtherExpress Pro100 at
0xd4843000, 00:A0:C9:92:06:22, IRQ 5.
May 3 14:59:34 newspeer2 kernel: Board assembly 678400-001, Physical
connectors present: RJ45
May 3 14:59:34 newspeer2 kernel: Primary interface chip i82555 PHY #1.
May 3 14:59:34 newspeer2 kernel: General self-test: passed.
May 3 14:59:35 newspeer2 kernel: Serial sub-system self-test: passed.
May 3 14:59:35 newspeer2 kernel: Internal registers self-test: passed.
May 3 14:59:35 newspeer2 kernel: ROM checksum self-test: passed
(0x49caa8d6).
May 3 14:59:35 newspeer2 kernel: Receiver lock-up workaround activated.

Then, to fix the full-duplex problem after boot, I do:

/usr/local/bin/mii-diag eth1 -F 100baseTX-FD
/usr/local/bin/mii-diag eth0 -F 100baseTX-FD

Also, for the Solaris geeks, just to show the patches on Solaris:
SunOS ratbert 5.6 Generic_105181-20 sun4u sparc SUNW,Ultra-2

showrev -a | grep 105529
Patch: 105529-08 Obsoletes: Requires: Incompatibles: Packages: SUNWcsr

I have patched for the bug below, but maybe it is related?
 Patch-ID# 105529-08
 Keywords: security tcp rlogin TCP ACK FIN packet listen
 Synopsis: SunOS 5.6: /kernel/drv/tcp patch
 Date: Sep/22/99

 Topic: SunOS 5.6: /kernel/drv/tcp patch

 BugId's fixed with this patch: 4060583 4083814 4089811 4118528 4128642
 4153353 4155373 4178455

Some info from sunsolve:
 Sun fails to ack the FIN from the other system. It is a little subtle,
 because the other system sends data in the packet with the FIN and we
 ack all the data, but not the FIN. Since we do not ack the FIN, he
 resends it, but instead of sending just the FIN, it resends the entire
 last packet. We continue to ignore the FIN. In addition, it is apparent
 from the behavior of the ftpd app that the data that has been received
 is never delivered to the ftpd.

 It appears that the next to last packet in the ftp-data stream is lost.
 We have received the FIN in the last packet, but we do not act on it
 because of the missing segment. We ack what we have and the other system
 resends the missing packet. Now that we have received all the data, we
 ack all the way to the end of the data stream, but we do not ack the FIN.
 After a timeout, the other system resends the entire packet, and again
 we ack the data, as if the FIN was not there. This continues ad nauseam
 until the whole thing times out and aborts. My guess is that we forget
 that the first FIN is a FIN when we process it after the retransmit. And
 then we ignore the subsequent packets because we have already received
 the data up through the end of the packet. We probably toss it without
 looking for the FIN, on the grounds it only contains data that we have
 already received, i.e. it is a duplicate.
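
The guess above can be made concrete with a toy model. The Python sketch
below only illustrates the hypothesis; it is not Solaris code. The Receiver
class, its buggy flag and the replay helper are made-up names, and the
segment list is taken from the trace that follows, with the "+1024" entry
expanded to 1592726979. With buggy=True the receiver forgets the FIN flag on
a segment it buffers out of order, and later tosses a retransmission whose
data it already has without looking for a FIN.

```python
class Receiver:
    """Toy model of an in-order TCP receiver (hypothetical, not Solaris code)."""

    def __init__(self, rcv_nxt, buggy):
        self.rcv_nxt = rcv_nxt   # next sequence number we expect
        self.buggy = buggy       # True: model the guessed FIN-handling bug
        self.pending = {}        # out-of-order segments: seq -> (length, fin)
        self.fin_seen = False

    def _accept(self, end, fin):
        # Advance past the data, and past the FIN if one is present.
        self.rcv_nxt = max(self.rcv_nxt, end)
        if fin and not self.fin_seen:
            self.fin_seen = True
            self.rcv_nxt += 1    # a FIN consumes one sequence number

    def receive(self, seq, length, fin=False):
        """Process one segment and return the cumulative ACK to send."""
        end = seq + length
        if seq > self.rcv_nxt:
            # Gap before this segment: buffer it. The buggy variant
            # forgets that the buffered segment carried a FIN.
            self.pending[seq] = (length, fin and not self.buggy)
            return self.rcv_nxt
        if end <= self.rcv_nxt:
            # No new data. A correct receiver still honours a new FIN;
            # the buggy variant tosses any duplicate that carries data.
            fin_is_new = fin and end == self.rcv_nxt and not self.fin_seen
            if not fin_is_new or (self.buggy and length > 0):
                return self.rcv_nxt
        self._accept(end, fin)
        # Deliver any buffered segments that are now in order.
        while self.rcv_nxt in self.pending and not self.fin_seen:
            buffered_len, buffered_fin = self.pending.pop(self.rcv_nxt)
            self._accept(self.rcv_nxt + buffered_len, buffered_fin)
        return self.rcv_nxt


# (seq, length, fin) for the 130->10 segments, packets 119 to 127.
TRACE = [
    (1592725955, 1024, False),
    (1592726979, 1024, False),
    (1592728003, 1024, False),
    (1592730051, 597, True),    # FIN arrives ahead of the missing segment
    (1592729027, 1024, False),  # the missing segment
    (1592730051, 597, True),    # retransmitted FIN segment
]


def replay(buggy):
    """Feed TRACE to a fresh Receiver and return the ACK sent per segment."""
    rx = Receiver(rcv_nxt=TRACE[0][0], buggy=buggy)
    return [rx.receive(seq, length, fin) for seq, length, fin in TRACE]


if __name__ == "__main__":
    print("buggy  :", replay(buggy=True)[-3:])
    print("correct:", replay(buggy=False)[-3:])
```

With buggy=True the last three ACKs are 1592729027, 1592730648 and
1592730648, the same values the Sun sends in packets 124, 126 and 128 of the
trace. With buggy=False the last two become 1592730649, which acknowledges
the FIN.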

 Here are the relevant portions of the trace (the Sun is 10, the other guy
 is 130):

 time    #    src->dst  Flgs  sequence    ack         length  win
 -----------------------------------------------------------------
 19.038  119  130->10   PA    1592725955  801669578   1024    4096
 19.06   120  130->10   PA    +1024       801669578   1024    4096
 19.068  121  10->130   A     801669578   1592728003  0       24820
 19.098  122  130->10   PA    1592728003  801669578   1024    4096
         Note: Missing packet. Sequence number of next packet is not the
               next sequence number.
 19.128  123  130->10   FPA   1592730051  801669578   597     4096
 19.128  124  10->130   A     801669578   1592729027  0       24820
         Note: Next packet is resend of the missing packet. (Some
               controversy here. Customer thinks it is the packet that
               has come out of order. Nothing in the ID field, so
               impossible to tell.)
 19.148  125  130->10   PA    1592729027  801669578   1024    4096
 19.148  126  10->130   A     801669578   1592730648  0       24820
 20.049  127  130->10   FPA   1592730051  801669578   597     4096
 20.049  128  10->130   A     801669578   1592730648  0       24820

 And so on, with the last two packets repeating at increasing intervals.
 Notice that we ack 1592730648, which is (1592730051 + 597), i.e. the ack
 of the data only. The RFC says that FIN also consumes a sequence number,
 so the last ack should be 1592730649.
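
As a quick check of the arithmetic: the acknowledgement number a receiver
should return for a segment is the segment's sequence number plus its data
length, plus one for each of SYN and FIN if set, modulo 2^32 (RFC 793). The
helper below is a minimal sketch; expected_ack is a made-up name, and the
numbers are those of the FPA segment in the trace (sequence 1592730051,
length 597).

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def expected_ack(seq, length, fin=False, syn=False):
    """ACK number that fully acknowledges one segment.

    SYN and FIN each consume one sequence number in addition to the data.
    """
    return (seq + length + int(fin) + int(syn)) % SEQ_MOD


if __name__ == "__main__":
    # ACK covering only the 597 data bytes of the FPA segment.
    print(expected_ack(1592730051, 597))
    # ACK covering the data and the FIN.
    print(expected_ack(1592730051, 597, fin=True))
```

The first call gives 1592730648, the value the Sun keeps sending; the second
gives 1592730649, the value that would acknowledge the FIN as well.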

 Customer is on 2.5 and sees problem, but says it is also on 2.4

 Full trace info from customer can be found at
 /net/network.east/traces/3046397

On Thu, 4 May 2000, Anton Ivanov wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
>
> Can you post the driver part of dmesg and ifconfig info from linux and
> info on ports and stats from switch as well please. Clear the stats on the
> switch before rerunning the experiment as well if possible.
>
> In btw, what is the switch?
>
> Thanks in advance,
>
> - ----------------------------------
> Anton R. Ivanov
> IP Engineer Level3 Communications
> RIPE: ARI2-RIPE E-Mail: Anton Ivanov <aivanov@eu.level3.net>
> @*** The Ultimate Principle ***
> By definition, when you are investigating the unknown - you do not
> know what you will find.
>
> - ----------------------------------
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.0.0 (GNU/Linux)
> Comment: For info see http://www.gnupg.org
>
> iQEVAwUBORGUVSlWAw/bM84zAQE0Xwf7BcnEk2oxjRf7fbapzt5pm6THUM2j1Ehn
> SRh3U0jRtrm5QtlBrGmRf9cH37K4VF/A5ub/0awGcd1P0FMhwNhbaqoG8IcXZ20m
> t2owdHd34lfXFjvFdO/+8/l+yGajYKq8IrKEj+LSy0V3t0yFvx27laZ9ZH9sWtBp
> PVVMOz6qXPXHrilcbvlRj4sCtBVJv3zI7DfpT6/eDlhJaU2LGwya5Z9T1UEy+Xu2
> tra1xlYE2xqZHovA58001mT0Nli79bKm/21CeeFAi8bLJtCQlq4LsJ4mh/51kcir
> jnDkenadFLyXhI7g/IOs/duoVTl/Wd+XsJq4eWYuBPBKfAhc1pr5Yw==
> =He/V
> -----END PGP SIGNATURE-----
>



