Re: mss negotiation and path mtu discovery mostly broken?

From: Andi Kleen
Date: Wed Apr 25 2007 - 10:13:29 EST

"Ristuccia, Brian" <bristuccia@xxxxxxxxxxxxxxxxxxx> writes:

> I'm seeing a problem where the kernel attempts to send packets with a
> MSS larger than the one negotiated when the TCP connection is
> established. Even after ICMP "can't fragment" messages arrive, the
> kernel still attempts to increase the MSS rather aggressively. The end
> result is extremely poor throughput when sending to a network with a
> smaller MTU.
> In /proc/sys/net/ipv4:
> ip_no_pmtu_disc:0
> tcp_mtu_probing:0
> The sending host ( has an MTU of 9000. The destination host
> ( has an MTU of 1500. There is one router between the hosts
> which will drop packets with the "DF" flag when they don't fit the
> destination interface's MTU and generates the required icmp can't
> fragment message.
> The dump shows the initial handshake with correct mss options sent:
> 08:39:55.493029 IP > S 2768979373:2768979373(0) win 5840 <mss 1460,sackOK,timestamp 3873837730 0,nop,wscale 2>
> 08:39:55.493119 IP > S 963242385:963242385(0) ack 2768979374 win 17896 <mss 8960,sackOK,timestamp 413751

The MSS clamp for sending to is 8960. MSS is only
one way -- each uses what the other tells it.
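
To make the "one way" point concrete, here is a rough sketch of how each side could arrive at its sending segment size from the MTUs in the report. The function names and the fixed 20-byte IPv4 and TCP headers are assumptions for illustration, and the 12 bytes subtracted for the timestamp option reflect the option being padded to a 4-byte boundary. This is a simplified model, not the kernel's actual code path.

```python
IPV4_HEADER = 20       # bytes, assuming no IP options
TCP_HEADER = 20        # bytes, base TCP header without options
TIMESTAMP_OPTION = 12  # 10-byte timestamp option padded to 12 bytes


def advertised_mss(local_mtu):
    """MSS a host announces in its SYN, derived from its own interface MTU."""
    return local_mtu - IPV4_HEADER - TCP_HEADER


def send_mss(peer_advertised_mss, local_mtu, timestamps=True):
    """Largest payload this host puts in one segment towards the peer.

    The sender is limited by what the peer advertised and by its own MTU;
    the value it advertised itself plays no role in this direction.
    """
    limit = min(peer_advertised_mss, advertised_mss(local_mtu))
    if timestamps:
        limit -= TIMESTAMP_OPTION
    return limit


# Hypothetical labels for the two hosts in the report.
jumbo_host_mtu = 9000
small_host_mtu = 1500

jumbo_advertises = advertised_mss(jumbo_host_mtu)  # 8960, as in the SYN/ACK
small_advertises = advertised_mss(small_host_mtu)  # 1460, as in the SYN

# Each direction is clamped by the *other* side's advertisement.
jumbo_to_small = send_mss(small_advertises, jumbo_host_mtu)
small_to_jumbo = send_mss(jumbo_advertises, small_host_mtu)
```

With timestamps enabled, the jumbo-MTU host ends up at 1448 bytes of payload per segment, which matches the deliverable packet size in the report; 2896 is exactly two such segments.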

> In the following dump, the system eventually gets in a state where it
> oscillates between sending undeliverable 2896 byte packets and
> deliverable 1448 byte ones.

This should only happen when the cached PMTU expires, which by default
takes ~10 minutes (net.ipv4.route.mtu_expires is 600 seconds). Perhaps
you changed that value manually using sysctl.
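
The expiry behaviour can be modelled roughly as below. This is only a toy model with hypothetical names (PmtuEntry, on_frag_needed, current), not the kernel's route cache. The expiry interval is passed in explicitly; the value used in the example is an assumed setting, so check net.ipv4.route.mtu_expires on the machine in question.

```python
class PmtuEntry:
    """Toy model of a cached path MTU for one destination."""

    def __init__(self, link_mtu, expires_after):
        self.link_mtu = link_mtu            # MTU of the outgoing interface
        self.expires_after = expires_after  # seconds a learned PMTU stays valid
        self.learned = None                 # PMTU learned from ICMP, if any
        self.learned_at = None              # time the PMTU was learned

    def on_frag_needed(self, next_hop_mtu, now):
        """Record the MTU reported by an ICMP "fragmentation needed" message."""
        self.learned = min(next_hop_mtu, self.link_mtu)
        self.learned_at = now

    def current(self, now):
        """Return the MTU to use at time `now`, dropping an expired entry."""
        if self.learned is not None and now - self.learned_at >= self.expires_after:
            # Entry expired: fall back to the interface MTU and re-probe.
            self.learned = None
            self.learned_at = None
        return self.learned if self.learned is not None else self.link_mtu


# Assumed values: 9000-byte interface, 600 s expiry, 1500-byte bottleneck.
entry = PmtuEntry(link_mtu=9000, expires_after=600.0)
before_icmp = entry.current(now=0.0)
entry.on_frag_needed(next_hop_mtu=1500, now=1.0)
after_icmp = entry.current(now=2.0)
after_expiry = entry.current(now=601.0)
```

In this model the sender only goes back to large packets once per expiry interval. If the reported oscillation happens much more often than that, a very small expiry setting would be one thing to rule out.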
