Re: 2.6.20.7 mss negotiation and path mtu discovery mostly broken?
From: Andi Kleen
Date: Wed Apr 25 2007 - 10:13:29 EST
"Ristuccia, Brian" <bristuccia@xxxxxxxxxxxxxxxxxxx> writes:
> I'm seeing a problem where the kernel attempts to send packets with a
> MSS larger than the one negotiated when the TCP connection is
> established. Even after ICMP "can't fragment" messages arrive, the
> kernel still attempts to increase the MSS rather aggressively. The end
> result is extremely poor throughput when sending to a network with a
> smaller MTU.
>
> In /proc/sys/net/ipv4:
> ip_no_pmtu_disc:0
> tcp_mtu_probing:0
>
> The sending host (10.2.10.254) has an MTU of 9000. The destination host
> (12.33.234.69) has an MTU of 1500. There is one router between the hosts
> which will drop packets with the "DF" flag when they don't fit the
> destination interface's MTU and generates the required icmp can't
> fragment message.
>
> The dump shows the initial handshake with correct mss options sent:
>
> 08:39:55.493029 IP 12.33.234.69.35026 > 10.2.10.254.22: S
> 2768979373:2768979373(
> 0) win 5840 <mss 1460,sackOK,timestamp 3873837730 0,nop,wscale 2>
> 08:39:55.493119 IP 10.2.10.254.22 > 12.33.234.69.35026: S
> 963242385:963242385(0)
> ack 2768979374 win 17896 <mss 8960,sackOK,timestamp 413751
The MSS clamp for sending to 10.2.10.254.22 is 8960. MSS is only
one way -- each uses what the other tells it.
> In the following dump, the system eventually gets in a state where it
> oscillates between sendng undeliverable 2896 byte packets and
> deliverable 1448 byte ones.
This should only happen on PMTU expire, which is normally ~15mins.
Perhaps you misconfigured it manually using sysctl.
-And
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/