Re: [BUG] TCP connections between Linux and NT

Andi Kleen (ak@muc.de)
Sat, 10 Jul 1999 20:25:35 +0200


[cc'ed to netdev, because the topic of the initial cwnd came just up there.
Seems we got the first victim]

On Sat, Jul 10, 1999 at 06:51:36PM +0200, Petru Paler wrote:
> [1.] One line summary of the problem:
>
> TCP connections between Linux and NT slows down very much (on a local
> network)
>
> [2.] Full description of the problem/report:
>
> Any kind of TCP connections between a Linux (2.2.10) and a NT Server 4
> (Service Pack 5) slow down to a crawl.

[...]
19:21:25.419960 hip.ro.ftp-data > kobold.hip.ro.1280: P 2049:3509(1460) ack 1 wi
n 32120 (DF)
19:21:25.420131 hip.ro.ftp-data > kobold.hip.ro.1280: P 3509:4969(1460) ack 1 wi
n 32120 (DF)
19:21:25.421378 hip.ro.ftp-data > kobold.hip.ro.1280: P 4969:6429(1460) ack 1 wi
n 32120 (DF)
19:21:25.616374 hip.ro.ftp-data > kobold.hip.ro.1280: P 2049:3509(1460) ack 1 wi
n 32120 (DF)
19:21:26.016350 hip.ro.ftp-data > kobold.hip.ro.1280: P 2049:3509(1460) ack 1 wi
n 32120 (DF)
19:21:26.017984 kobold.hip.ro.1280 > hip.ro.ftp-data: . ack 4969 win 8760 (DF)

The NT machine has major problems to get back the ACK on time (it only sends it
three packets later). At first it looks like the Ethernet capture effect (Linux
sending so fast that the NT box doesn't get access to the net and eats lots of
collisons). But if you look at the timestamps you see that Linux leaves enough
gaps between the packets that the NT box should have enough wire to sneak an ACK
in. The slow down is caused because the ack clock is disturbed. Linux 2.2 changed
to a initial congestion window of 2, this makes the problem worse.

Possibilities:
- Linux driver problem (does the packet/ack ratio look similar when you tcpdump
from a third machine?)
- Problem on the NT side. Could you check how many collisions NT counts for the
interface?
- Bad cable.

Workarounds:
- Decrease the window size for the NT machine (route add -host ntmachine window
16384 or smaller) This is really nasty and should be the last resort.
- Switch back to initial cwnd==1 by changing the ->snd_cwnd = 2; assignments line in
net/ipv4/tcp_ipv4.c:tcp_v4_init_sock() and tcp_create_openreq_child() to
snd_cwnd = 1;

Does any of that help?

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/