3C509 Network Freezes -- seems new in 2.2.10

Robert Gash (gashalot@cybermax.net)
Tue, 06 Jul 1999 23:37:00 +0000


I have a P100 Linux machine at my house that acts as a firewall/gateway
to the Internet for my home LAN (IP masquerading). I have a 2-way
cablemodem from Mediaone, therefore I must use DHCP to obtain an IP
address. I am currently using the dhclient (ISC's new DHCP-v2 official
client) to get the IP address, and the DHCP client and server seem to
work just fine.

Most of the time it seems to work fine, but I have been recently running
into a problem that the network goes into a state of REALLY bad latency
(pings from good and bad times are below). The time period that it
takes before it breaks seems to be completeley random and there dosen't
appear to be a thing that I can do to fix this (there is no weird system
activity [high cpu load, etc] running at these times, and noone is
attempting to DoS the box either, as eth0 is silent for all traffic). I
have found that taking the interface offline and re-running the dhclient
program works fine and the net is once again reachable and operates at a
good clip all of the time. The problem is also not on MediaOne's end,
as I tried it with my other NIC and it appears to be stable. I will
reiterate that this is ALWAYS Fixed IMMEDIATLEY by bringing eth0 down
then re-running the dhclient software (brings eth0 back up), so I can
bet it's not M1 (I always have normal operation lights on the cablemodem
when this happens). I have checked all cables and this does not appear
to be a hardware issue.

This appears to be a bug in 2.2.10 as far as I can see (I have tried
both with 2.2.10 stock and 2.2.10-ac8. I can't really track down the
problem that could be occuring here, but when just tinkering around I
found that a ping -f to my primary nameserver at work seems to alleviate
this problem (but I should note that running a ping -f is a bad way to
fix things, and I only do it to machines that I know can take it and
that I manage for my office). It is also worth noting that when I watch
the pings (to catch the problem) that they grow progressivley longer
(30ms, 50ms, 100ms, 300ms, 500ms, 900ms, 1300ms, 3000ms, 15000ms,
30000ms, etc.).

Does *ANYONE* have any idea what could be wrong? I have posted below a
brief summary of my hardware driver revisions for those who are
interested. At the very end is sample output from two timeperiods
(about 30s apart) before and after I bring the nic up and down.

AST Bravo LC P/100 (P100, 40MB EDO RAM, 3.8 GB [2hd's] storage)
3Com 3C509 (10bT-only model)
Linksys Etherfast 10/100 (Lite-On PNIC using tulip.c)

Of course the 3C509 is ISA and the Linksys is PCI. I use IP Masqing and
all of the modules (including ip_masq_icq) all of the time. The Linksys
is my internal NIC and the external is the 3c509. The 3c509 driver is
compiled into the kernel, and the tulip is loaded as a module (so it
always comes up as eth1).

For those who made it this far, I did a dupe-post to linux-kernel and
linux-net because I feel it is related to both (or at least it seems
that way).

eth0: 3c509 at 0x300 tag 1, 10baseT port, address 00 60 08 10 bd 16,
IRQ 10.
3c509.c:1.16 (2.2) 2/3/98 becker@cesdis.gsfc.nasa.gov.

tulip.c:v0.91 4/14/99 becker@cesdis.gsfc.nasa.gov
eth1: Lite-On 82c168 PNIC rev 32 at 0xfc00, 00:A0:CC:28:9C:7A, IRQ 11.
eth1: MII transceiver #1 config 3100 status 782d advertising 01e1.

uname -a:
Linux home 2.2.10-ac8 #2 Sun Jul 4 00:25:39 EDT 1999 i586 unknown

Problem:
64 bytes from 207.19.133.3: icmp_seq=31 ttl=242 time=2184.2 ms
64 bytes from 207.19.133.3: icmp_seq=32 ttl=242 time=3054.4 ms
64 bytes from 207.19.133.3: icmp_seq=33 ttl=242 time=3042.4 ms
64 bytes from 207.19.133.3: icmp_seq=34 ttl=242 time=3157.2 ms
64 bytes from 207.19.133.3: icmp_seq=35 ttl=242 time=4034.3 ms
64 bytes from 207.19.133.3: icmp_seq=36 ttl=242 time=4040.1 ms

Normally the network operates like this:
64 bytes from 207.19.133.3: icmp_seq=53 ttl=242 time=36.4 ms
64 bytes from 207.19.133.3: icmp_seq=54 ttl=242 time=29.1 ms
64 bytes from 207.19.133.3: icmp_seq=55 ttl=242 time=30.3 ms
64 bytes from 207.19.133.3: icmp_seq=56 ttl=242 time=30.7 ms
64 bytes from 207.19.133.3: icmp_seq=57 ttl=242 time=28.7 ms
64 bytes from 207.19.133.3: icmp_seq=58 ttl=242 time=31.7 ms

-- 
Robert Gash                  |    _____     __
Systems Administrator        |   / ___/_ __/ /  ___ ______ _  ___ ___ __
Phone: (904) 281-2200 x3312  |  / /__/ // / _ \/ -_) __/  ' \/ _ `/\ \ /
Fax: (904) 296-4203          |  \___/\_, /_.__/\__/_/ /_/_/_/\_,_//_\_\
gashalot@cybermax.net        |      /___/ , Inc.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/