Re: [net PATCH] tcp: Mark fastopen SYN packet as lost when receiving ICMP_TOOBIG/ICMP_FRAG_NEEDED
From: Eric Dumazet
Date:  Fri Dec 11 2020 - 14:19:38 EST
On Fri, Dec 11, 2020 at 6:15 PM Alexander Duyck
<alexander.duyck@xxxxxxxxx> wrote:
>
> On Fri, Dec 11, 2020 at 8:22 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >
> > On Fri, Dec 11, 2020 at 5:03 PM Alexander Duyck
> > <alexander.duyck@xxxxxxxxx> wrote:
> >
> > > That's fine. I can target this for net-next. I had just selected net
> > > since I had considered it a fix, but I suppose it could be considered
> > > a behavioral change.
> >
> > We are very late in the 5.10 cycle, and we never handled ICMP in this
> > state, so net-next is definitely better.
> >
> > Note that RFC 7413 states in 4.1.3 :
> >
> >  The client MUST cache cookies from servers for later Fast Open
> >    connections.  For a multihomed client, the cookies are dependent on
> >    the client and server IP addresses.  Hence, the client should cache
> >    at most one (most recently received) cookie per client and server IP
> >    address pair.
> >
> >    When caching cookies, we recommend that the client also cache the
> >    Maximum Segment Size (MSS) advertised by the server.  The client can
> >    cache the MSS advertised by the server in order to determine the
> >    maximum amount of data that the client can fit in the SYN packet in
> >    subsequent TFO connections.  Caching the server MSS is useful
> >    because, with Fast Open, a client sends data in the SYN packet before
> >    the server announces its MSS in the SYN-ACK packet.  If the client
> >    sends more data in the SYN packet than the server will accept, this
> >    will likely require the client to retransmit some or all of the data.
> >    Hence, caching the server MSS can enhance performance.
> >
> >    Without a cached server MSS, the amount of data in the SYN packet is
> >    limited to the default MSS of 536 bytes for IPv4 [RFC1122] and 1220
> >    bytes for IPv6 [RFC2460].  Even if the client complies with this
> >    limit when sending the SYN, it is known that an IPv4 receiver
> >    advertising an MSS less than 536 bytes can receive a segment larger
> >    than it is expecting.
> >
> >    If the cached MSS is larger than the typical size (1460 bytes for
> >    IPv4 or 1440 bytes for IPv6), then the excess data in the SYN packet
> >    may cause problems that offset the performance benefit of Fast Open.
> >    For example, the unusually large SYN may trigger IP fragmentation and
> >    may confuse firewalls or middleboxes, causing SYN retransmission and
> >    other side effects.  Therefore, the client MAY limit the cached MSS
> >    to 1460 bytes for IPv4 or 1440 for IPv6.
> >
> >
> > Relying on ICMP is fragile, since they can be filtered in some way.
>
> In this case I am not relying on the ICMP, but thought that since I
> have it I should make use of it. WIthout the ICMP we would still just
> be waiting on the retransmit timer.
>
> The problem case has a v6-in-v6 tunnel between the client and the
> endpoint so both ends assume an MTU 1500 and advertise a 1440 MSS
> which works fine until they actually go to send a large packet between
> the two. At that point the tunnel is triggering an ICMP_TOOBIG and the
> endpoint is stalling since the MSS is dropped to 1400, but the SYN and
> data payload were already smaller than that so no retransmits are
> being triggered. This results in TFO being 1s slower than non-TFO
> because of the failure to trigger the retransmit for the frame that
> violated the PMTU. The patch is meant to get the two back into
> comparable times.
Okay... Have you studied why tcp_v4_mtu_reduced() (and IPv6 equivalent)
code does not yet handle the retransmit in TCP_SYN_SENT state ?