Sustained network traffic dies with 2.2.*

Dehartog (dehartog@telekabel.nl)
Wed, 21 Jul 1999 01:19:04 +0200 (MET DST)


Symptoms:
Sustained network traffic ((t)ftp, rcp and ping -f) dies immediately or after
a few seconds. However, not always: a ping -f from a 133 MHz laptop to a 350
MHz server keeps going; the other way around gives an immediate screenful of
dots. On the Linux side(s), taking the interface down and up again
("ifconfig eth* down" followed by "ifconfig eth* up") makes it usable again.
No problems with interactive connections (telnet, xterm).
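
For reference, the down/up workaround spelled out as a small shell sketch.
The function name bounce_iface and the DRY_RUN switch are only there for
illustration; the real commands need root:

```sh
# bounce_iface IFACE: take one interface down and bring it back up.
# Set DRY_RUN=1 to print the commands instead of running them.
bounce_iface() {
    iface="$1"
    for state in down up; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ifconfig $iface $state"
        else
            ifconfig "$iface" "$state" || return 1
        fi
    done
}
```

For example, "DRY_RUN=1 bounce_iface eth1" only shows what would be run.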

Major functionality I need: upon bootp requests, transferring images with tftp
to (old) X-terminals (like VXT 2000). See tcpdump at the end of this message.

Most important (IMHO):
---------------------------------------------------------------------------
Everything works fine with 2.0.36 (RedHat 5.2) and goes wrong with SuSE 6.1
(2.2.5 and 2.2.7) and all vanilla kernels up to 2.3.9(!)
---------------------------------------------------------------------------

Besides lots of kernels, other things I tried (that did NOT help):

echo "0" > /proc/sys/net/ipv4/tcp_sack
echo "0" > /proc/sys/net/ipv4/tcp_timestamps

switching cables and interfaces (AUI <-> BNC)

rmmod vmware's modules (vmnet and vmmon)

minimize the number of rules for ipchains (i.e. delete all the rules and set
the default policy to ACCEPT)

After reading bad stories about ne2k cards, I replaced all my ne2k-pci cards
with 3Com cards (also pci)

I'm desperate after 2 months of trying to solve this. Please help.
Thanks for your time and effort to look into this problem.

Kind regards,
Hans de Hartog

======================
What follows is a tcpdump between a server (10.1.1.5) that receives a bootp
request from a VXT and starts a tftp-transfer to the VXT (10.1.1.11):

tcpdump: listening on eth1
00:45:46.465551 0.0.0.0.68 > 255.255.255.255.67: xid:0x66d30000 [|bootp] (ttl 30, id 2447)
00:45:46.466544 10.1.1.5.67 > 10.1.1.11.68: xid:0x66d30000 Y:10.1.1.11 S:10.1.1.5 [|bootp] (ttl 64, id 20755)
00:45:46.483942 arp who-has 10.1.1.5 tell 10.1.1.11
00:45:46.484116 arp reply 10.1.1.5 is-at 0:60:97:ed:cb:2a
00:45:46.485092 10.1.1.11 > 10.1.1.5: icmp: address mask request (ttl 30, id 2447)
00:45:50.445417 10.1.1.11.54119 > 10.1.1.5.69: 24 RRQ "/tftpboot/VXTEX" (ttl 30, id 2448)
00:45:50.475629 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 20789)
00:45:50.517352 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 2449)
00:45:50.518184 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 20790)
00:45:50.526071 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 2450)
00:45:50.526882 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 20792)
00:45:50.532799 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 2451)
00:45:50.533571 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 20793)
00:45:50.539425 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 2452)
00:45:50.540207 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 20794)
00:45:50.546123 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 2453)
00:45:50.546921 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 20796)
00:45:50.552775 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 2454)

<about 3260 lines like these>

00:46:01.438798 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 22516)
00:46:01.444655 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 4082)
00:46:01.445432 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 22517)
00:46:01.451348 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 4083)
00:46:01.452123 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 22518)
00:46:01.457978 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 4084)
00:46:01.458754 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 22519)
00:46:01.464670 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 4085)
00:46:01.465444 10.1.1.5.1028 > 10.1.1.11.54119: udp 516 (ttl 64, id 22520)
00:46:01.471300 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 4086)

<here the transfer died after 60-70% of the 1.2 MB image had been transferred>

00:46:01.471561 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:02.470769 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:03.470778 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:06.470987 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:07.470837 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:08.470855 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:11.471000 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:12.470952 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:12.472065 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 4087)
00:46:13.470921 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:17.471082 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:18.470984 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:19.470999 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:22.471149 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:23.471050 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:23.472172 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 4088)
00:46:24.471064 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:36.346859 10.1.1.11.54119 > 10.1.1.5.1028: udp 4 (ttl 30, id 4089)
00:46:36.347082 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:37.341242 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:38.341254 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:50.218229 10.1.1.11.54119 > 10.1.1.5.1028: udp 5 (ttl 30, id 4090)
00:46:50.218440 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:50.235716 snap 8:0:2b:60:1 8:0:2b:2c:3b:96 > ab:0:0:1:0:0 sap aa ui/C

<from here on, the VXT tries other protocols like MOP V3 and MOP V4>

>>> Unknown IPX Data: (42 bytes)
[000] 02 06 56 58 54 4C 44 52 00 01 00 03 04 00 00 02 ..VXTLDR ........
[010] 00 02 59 00 07 00 06 08 00 2B 2C 3B 96 64 00 01 ..Y..... .+,;.d..
[020] 49 90 01 01 01 91 01 02 06 04 I....... ..
len=42
00:46:51.211434 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:52.211443 arp who-has 10.1.1.11 tell 10.1.1.5
00:46:56.170659 snap 8:0:2b:60:1 8:0:2b:2c:3b:96 > ab:0:0:1:0:0 sap aa ui/C
>>> Unknown IPX Data: (42 bytes)
[000] 02 06 56 58 54 4C 44 52 00 01 00 03 04 00 00 02 ..VXTLDR ........
[010] 00 02 59 00 07 00 06 08 00 2B 2C 3B 96 64 00 01 ..Y..... .+,;.d..
[020] 49 90 01 01 01 91 01 02 06 04 I....... ..
len=42

3322 packets received by filter
0 packets dropped by kernel
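
A quick way to locate the stall in a saved capture like the one above is to
count the TFTP data blocks sent by the server and its ARP requests for the
client. This is only a rough sketch: it assumes the tcpdump text output is fed
on stdin with the same column layout as above, and the function name
find_stall is made up for illustration. A "udp 516" packet is one TFTP data
block (4-byte header plus 512 bytes of data).

```sh
# find_stall CLIENT SERVER < tcpdump-text-output
find_stall() {
    awk -v client="$1" -v server="$2" '
        # TFTP data blocks sent by the server (516-byte UDP payload)
        index($2, server ".") == 1 && $5 == "udp" && $6 == "516" { blocks++ }
        # ARP requests from the server asking for the client
        $2 == "arp" && $3 == "who-has" && $4 == client && $6 == server {
            if (first == "") first = $1
            requests++
        }
        # ARP replies sent by the client
        $2 == "arp" && $3 == "reply" && $4 == client { replies++ }
        END {
            printf "data blocks sent: %d (%d bytes)\n", blocks, blocks * 512
            printf "first ARP request for client: %s\n", (first == "" ? "none" : first)
            printf "ARP requests: %d, replies: %d\n", requests, replies
        }'
}
```

Usage would be something like "find_stall 10.1.1.11 10.1.1.5 < dump.txt",
where dump.txt is a hypothetical file holding the capture. In the excerpt
shown here, the first ARP request for the VXT is at 00:46:01.471561 and no
ARP reply from 10.1.1.11 appears after it.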
