Re: pre-2.0.31 & network stalls

Manfred Petz (pm@radawana.cg.tuwien.ac.at)
Fri, 6 Jun 1997 10:29:42 +0200 (CEST)


> Manfred Petz <pm@radawana.cg.tuwien.ac.at> writes:
> >
> >The 2.0.30 kernel used to lock up on TCP connections from the
> >Linux side to the SVR4.2. _Very_ often. When I changed to another
> >VC and made a ping(1) to the SVR4.2 box, the connection awakened
> >again. And making a TCP connection (TELNET) from the SVR4.2 box
> >to the Linux machine also had no problems.
> >
> >Under pre-2.0.31 it seemed that this problem had gone. Until now.
>
> I need some clarification here. Was the problem you observed
> between the SVR4.2 box and the linux box over an ethernet wire,
> or over a PPP connection? If over an ethernet wire, what kind
> of hardware is involved?

Between the SVR4.2 and linux the problem was over the ethernet
wire. Linux has two 3c509 and the SVR4.2 an Intel Etherexpress 16.
BNC cabling. I tried it again and again; it seems to be gone with
pre-2.0.31. Too bad I did not run tcpdump with 2.0.30. Our linux
server is critical, so downtime is bad. I'll try to track it down
with 2.0.30 on Sunday night.

There is definitely no IRQ/port conflict. The h/w setup should be
o.k. The ethernet contains only two devices, the linux box and the
SVR4.2.

On Sunday, I'll also try different network cards. I want to throw
the Etherexpress 16 into the trashcan since I heard that there
are problems with it, at least with Linux.

>
> >Connections to our PPP clients lock up from time to time. It seems
> >that connections to the SVR4.2 have no problems. I can't say that for
> >sure, however. It seems that connections tend to lock-up when the
> >PPP client tries to transfer a large amount of data (a big file
> >with FTP while making other short-lived connections, like
> >playing around with www). The lockups happen on any PPP modem
> >line and also with only one client connected.
>
> Lockups on the PPP link _sound_ like a broken modem, or a broken
> VJ compression algorithm. Questions:
>
> (1) You're not using US Robotics Sportsters, are you?

Uh... I do... Four 33Kbit USR modems. Is there any known
problem with them???? ...please??

> (2) Try turning off VJ compression. You might be surprised.
> (Broken implementations include such popular systems as
> various releases of Shiva LanRover, Annex, and others....)

Ok. It's now disabled. But the problem also occurred between two
linux boxes, both with ppp-2.2.0f.

> >Because the TCP stalls under 2.0.30 and pre-2.0.31 happen
> >between both the SVR4.2 _and_ the PPP clients, it can't be a
> >network hardware related problem.
>
> Don't bet on it. :) There are more hardware problems out there
> than you might imagine. Almost every report I get about TCP freezing
> ends up coming down to broken hardware, or configuration problems.
> (Discounting reports about TCP freezing due to 20-50% packet loss
> rates on the internet at large. Heck, that's no surprise. The
> surprise is that you manage to start a conversation when loss
> rates are that high.)

After about 200K packets, ifconfig(1) still says errors:0, dropped:0,
overruns:0.
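
To check those counters repeatedly rather than by eye, a minimal sketch
follows. It assumes the classic net-tools ifconfig layout ("RX packets:N
errors:N dropped:N overruns:N ..."); the sample text and the function names
parse_counters and has_link_errors are hypothetical, and the numbers are
illustrative rather than taken from the machines discussed here.

```python
import re

# Hypothetical sample in the style of net-tools ifconfig output.
SAMPLE = """\
eth0      Link encap:Ethernet
          RX packets:200123 errors:0 dropped:0 overruns:0 frame:0
          TX packets:198765 errors:0 dropped:0 overruns:0 carrier:0
"""

# Matches the per-direction counter lines and captures the trailing counters.
COUNTER_LINE = re.compile(r"^\s*(RX|TX) packets:(\d+)\s+(.*)$")


def parse_counters(text):
    """Return {'RX': {...}, 'TX': {...}} with integer counters."""
    result = {}
    for line in text.splitlines():
        m = COUNTER_LINE.match(line)
        if not m:
            continue
        direction, packets, rest = m.groups()
        counters = {"packets": int(packets)}
        for name, value in re.findall(r"(\w+):(\d+)", rest):
            counters[name] = int(value)
        result[direction] = counters
    return result


def has_link_errors(counters):
    """True if any counter other than 'packets' is non-zero."""
    return any(
        value != 0
        for direction in counters.values()
        for name, value in direction.items()
        if name != "packets"
    )


if __name__ == "__main__":
    print(has_link_errors(parse_counters(SAMPLE)))
```

Clean counters only rule out errors that the driver actually reports; they
do not prove the hardware is fine.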

[snip]

> Yes, you can help. The main things that I would need are
> "tcpdump -n -S -tt" dumps on the interface(s) showing the problem.
> Try to capture entire conversations. More than one if possible.
> Capturing the "ping" effect would be especially interesting.
> A full run down of the link hardware involved is also important.
> Both sides of the link in the case of the PPP link.

Ok. On Sunday I'll check this out with both 2.0.30 and pre-2.0.31. I
tried for about 3 hours yesterday, but the problem seems to vanish if
tcpdump is involved. ...murphy :-(
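
Once a capture does exist, stalls can be picked out of the "tcpdump -tt"
text by looking for long gaps between consecutive timestamps. The sketch
below assumes each packet line starts with the unformatted seconds-since-
epoch timestamp that -tt prints; the function name find_stalls, the
5-second threshold, and the sample addresses are hypothetical.

```python
def find_stalls(lines, threshold=5.0):
    """Return (prev_ts, ts, gap) for consecutive packets more than
    `threshold` seconds apart. Lines without a leading numeric
    timestamp are skipped."""
    stalls = []
    prev = None
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            continue
        try:
            ts = float(fields[0])
        except ValueError:
            continue  # e.g. tcpdump banner lines
        if prev is not None and ts - prev > threshold:
            stalls.append((prev, ts, ts - prev))
        prev = ts
    return stalls


# Hypothetical capture excerpt; addresses and values are invented.
SAMPLE_DUMP = [
    "tcpdump: listening on eth0",
    "865585782.25 10.0.0.1.1025 > 10.0.0.2.23: . ack 1 win 32120",
    "865585782.50 10.0.0.2.23 > 10.0.0.1.1025: P 1:13(12) ack 1 win 4096",
    "865585790.50 10.0.0.1.1025 > 10.0.0.2.23: . ack 13 win 32120",
]

if __name__ == "__main__":
    for start, end, gap in find_stalls(SAMPLE_DUMP):
        print("stall of %.2fs between %.2f and %.2f" % (gap, start, end))
```

A gap by itself does not say which side stopped talking; the sequence
numbers printed by -S on either side of the gap are still needed for that.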

>
> Seriously, I have no outstanding reports of performance problems with
> TCP that can be linked to a problem in TCP. If you have found such
> a problem I need to know about it in as much detail as possible in
> order to fix it. On the other hand, please make every effort to be
> sure that it really is a TCP problem. There are only so many of us
> working on the network code, and we don't have time to debug everyone's
> configuration/hardware problems.

Ok.

PS: After tuning the linux side a bit, I get 911KB/s on an ftp
transfer between Linux and SVR4.2. :-))

pm