Re: [2.2.9] performance problem on 100MBit network

cybertech@cybertech.org
Wed, 02 Jun 1999 02:42:15 GMT


Hi guys, let me add my 2 cents to this issue.

I've got 3 machines running 2.2.9. The problem below started when i
upgraded to 2.2.9. 2 of the machine are PPro200, 1 is PII 350. All are
running 3COM 3C905/905b PCI controllers -- 5 of them in total, 2 running at
10mb, 3 at 100mb.

After approximately 2 days of uptime, I will start to see ping times on the
local lan jump to 7-20 seconds. This occurs whether i'm pinging another
linux box, a windows box, or a cisco. I can temporarily fix the problem
with the following rc.inet1 script -- it lasts for about an hour.

It seems to be dependant upon the network load -- lighter loads lead to
longer periods between problems. The problem ALSO is gradual -- it'll
start at 4 seconds, then 7 about 20 minutes later, than 30 minutes later
it's up to 12-20 seconds. At this point the machine is almost unusable
from remote. The problem doesn't seem to be triggered by data being
forwarded -- I can route as much as i want thru the firewall (linux) w/o
affecting it, but if the same machine is used for server activities it does
begin to display the problem.

My kernel is compiled w/gcc version egcs-2.91.66 19990314 (egcs-1.1.2
release)
I normally compile the kernel w/ -mpentiumpro, but have gone back to
-mpentium for the time being -- that did not change the problem at all.
The network consists of a handful of windows boxes (98, NT4, Windows 2000),
a cisco 2524, and the linux boxes. Nothing is dhcp enabled, ALL nics in
all machine are 3com based, and i'm all out of relevant information. :)

All machines are using glibc 2.1, soon 2.1.1 -- if that changes anything
i'll let you know.

I've added the following script to cron run hourly, it's solved the problem
for now, but is by no means optimal. Just wanted others out there to know
it wasn't necessary to reboot.

#! /bin/sh
#
# rc.inet1 This shell script boots up the base INET system.
#
# Version: @(#)/etc/rc.d/rc.inet1 1.97 05/29/99
#

HOSTNAME=`cat /etc/HOSTNAME`

/sbin/ifconfig lo down
/sbin/ifconfig eth0 down
/sbin/ifconfig eth1 down
/sbin/ifconfig eth2 down
sleep 5
/sbin/ifconfig lo 127.0.0.1
/sbin/ifconfig eth0 x.x.x.x netmask x.x.x.x

.... continue setting up interfaces, etc...

David S. Stahl
President, CTCS Inc.
cybertech@cybertech.org


Jochen Heuer
<jogi@planetzork.ping.de> To: Rogier Wolff <R.E.Wolff@BitWizard.nl>
Sent by: cc: "David S. Miller" <davem@redhat.com>, sim@netnation.com,
owner-linux-kernel@vger.r linux-kernel@vger.rutgers.edu, (bcc: CyberTech/CTCS)
utgers.edu Subject: Re: [2.2.9] performance problem on 100MBit network


05/27/99 12:41 PM


On Thu, May 27, 1999 at 03:11:08PM +0200, Rogier Wolff wrote:
> David S. Miller wrote:
> > Date: Sat, 22 May 1999 16:08:30 +0200
> > From: Jochen Heuer <jogi@planetzork.ping.de>
> >
> > Sometimes I get ping times of up to 20 seconds(!) while normaly
> > ping times are ~0.3ms. If this happens I have to reboot either my
> > server or my client.
> >
> > What can I do to debug this if it happens again?
> >
> > Let me guess, you see this behavior on systems where DHCP is running
> > in some way? And it happens about when a client rebinds, and the
> > machine in question is going ARP crazy if you watch it on the network
> > with tcpdump?
>
> Another possibility is that he has an RTL8139 card in the machine.
> There is an obscure bug in that driver that causes the card to get
> a certain "backlog". One packet is transmitted for every packet
> received.
>
> If that's the case, Jochen will see that if he starts a parallel ping
> to the same machine on a different xterm, the ping-times will drop to
> 10s.
>
> I haven't had time to debug this yet. (and stupid me, the machine has
> gone into production :-( )
>
> > Did I guess right?
>
> Did I guess right?

Nope, sorry. But it is an SMC EtherPower II. How can I debug this if
it happens? I'm not so familiar with tcpdump :( And I have not seen
this problem for days now. I'm using 2.2.9 on my server (Dual P133)
and 2.3.3 on my client (K6-2 300). Both machines are equipped with
a SMC EtherPower II network card.

Regards,

Jogi

--

Well, yeah ... I suppose there's no point in getting greedy, is there?

<< Calvin & Hobbes >>

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/