Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled

From: Michal Kubecek
Date: Fri Jun 15 2018 - 05:28:02 EST

Next message: Johannes Berg: "Re: [PATCH v2] bitfield: fix *_encode_bits()"
Previous message: Jiri Olsa: "Re: [PATCH 2/3] perf alias: Rebuild alias expression string to make it comparable"
In reply to: Ilpo Järvinen: "Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled"
Next in thread: Ilpo Järvinen: "Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled"

On Fri, Jun 15, 2018 at 11:05:03AM +0300, Ilpo Järvinen wrote:
> On Thu, 14 Jun 2018, Michal Kubecek wrote:
> > The trace wouldn't look so nice but it can be reproduced even with more
> > data to send. I've copied an example below. I couldn't find a really
> > nice one quickly so that first few retransmits (17:22:13.865105 through
> > 17:23:05.841105) are without new data but starting at 17:23:58.189150,
> > you can see that sending new (previously unsent) data may not suffice to
> > break the loop.
>
> My point was that the new data segment bursts that occur if the sender
> isn't application limited indicate that there's something going wrong
> with FRTO. And that wrong is also what is causing that RTO loop because
> the sender doesn't see the previous FRTO recovery on second RTO. With
> my FRTO undo fix, (new_recovery || icsk->icsk_retransmits) will be false
> and that will prevent the RTO loop.

Yes, it would prevent the loop in this case (although a bit later, after
the second RTO rather than after the first). But I'm not convinced the
logic of the patch is correct. If I understand it correctly, it
essentially changes "presumption of innocence" (if we get an ack past
what we retransmitted, we assume it was received earlier - i.e. would
have been sacked before if SACK was in use) to "presumption of guilt"
(whenever a retransmitted segment is acked, we assume nothing else acked
with it was received earlier). In other words, you trade false negatives
for false positives.

Maybe I misunderstand it, but it seems that you de facto prevent
Step (3b) from ever happening in the non-SACK case.
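
To make the distinction concrete, here is a toy model of the two
presumptions as I described them above. It is only a sketch of my
wording, not the actual patch and not kernel code; the struct and
function names are hypothetical and it only models a single cumulative
ACK in the non-SACK case.

```c
#include <stdbool.h>

/* Hypothetical summary of what one cumulative ACK covers (non-SACK). */
struct ack_summary {
	bool acks_retransmitted;	/* covers a segment retransmitted after the RTO */
	bool acks_never_retransmitted;	/* covers data that was sent only once */
};

/*
 * "Presumption of innocence": once-sent data that gets acked is assumed
 * to have been delivered by its original transmission, so it counts as
 * evidence that the RTO was spurious.
 */
static bool spurious_evidence_innocence(const struct ack_summary *a)
{
	return a->acks_never_retransmitted;
}

/*
 * "Presumption of guilt": if the same ACK also covers a retransmitted
 * segment, nothing else acked with it counts as evidence.
 */
static bool spurious_evidence_guilt(const struct ack_summary *a)
{
	return a->acks_never_retransmitted && !a->acks_retransmitted;
}
```

The two functions differ only for an ACK that covers both the
retransmitted segment and data sent once, which is exactly the case
where a non-SACK sender cannot tell what arrived first.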

> > > No! The receiver should not update the window on ACKs it intends to
> > > designate as "duplicate ACKs". That is not without some potential cost
> > > though as it requires delaying window updates up to the next cumulative
> > > ACK. In the non-SACK series one of the changes is fixing this for
> > > non-SACK Linux TCP flows.
> >
> > That sounds like a reasonable change (at least at the first glance,
> > I didn't think about it too deeply) but even if we fix Linux stack to
> > behave like this, we cannot force everyone else to do the same.
>
> Unfortunately I don't know what the other stacks besides Linux do. But
> for Linux, the cause for the changing receiver window is the receiver
> window auto-tuning and I'm not sure if other stacks have a similar
> feature (or if that affects (almost) all ACKs like in Linux).

The capture from my previous e-mail and some others I have seen indicate
that at least some implementations do not take care to keep the window
size unchanged when responding to an out-of-order segment. That means
that even if we change Linux TCP this way (we might still need to send
a separate window update in some cases), we still cannot rely on others
doing the same.

I checked the capture attached to my previous e-mail again and there is
one thing where our F-RTO implementation (in 4.4, at least) is wrong,
IMHO. The first ACK after "new data" (sent in (2b)) was a window update
(and therefore not a dupack by definition), so we could take neither
(3a) nor (3b). But in some iterations there were further acks which did
not change the window size. The text just before Step 1 says:

    The F-RTO algorithm does not specify actions for receiving
    a segment that neither acknowledges new data nor is a duplicate
    acknowledgment. The TCP sender SHOULD ignore such segments and
    wait for a segment that either acknowledges new data or is
    a duplicate acknowledgment.

My understanding is that this means that while the first ack after new
data is correctly ignored, the following ack, which preserves the window
size, should be recognized as a dupack and (3a) should be taken.
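
A minimal sketch of the classification I have in mind, following the
duplicate ACK definition in RFC 5681. This is not the kernel code: the
types and names are hypothetical, the window is assumed to be already
scaled, and the caller is assumed to maintain data_outstanding.

```c
#include <stdbool.h>
#include <stdint.h>

enum ack_kind {
	ACK_NEW_DATA,	/* advances the cumulative ACK point */
	ACK_DUPLICATE,	/* duplicate ACK in the RFC 5681 sense */
	ACK_OTHER,	/* neither, e.g. a pure window update: ignore it */
};

/* Hypothetical view of the relevant fields of an incoming segment. */
struct ack_info {
	uint32_t ack_seq;	/* cumulative ACK number */
	uint32_t window;	/* advertised window (already scaled) */
	uint32_t payload_len;	/* bytes of data carried by the segment */
	bool syn_or_fin;	/* SYN or FIN flag set */
};

/* Hypothetical sender state; data_outstanding is kept by the caller. */
struct snd_state {
	uint32_t snd_una;	/* highest cumulative ACK received so far */
	uint32_t last_window;	/* window advertised in the last ACK seen */
	bool data_outstanding;	/* sender has unacknowledged data */
};

static enum ack_kind classify_ack(struct snd_state *s,
				  const struct ack_info *a)
{
	enum ack_kind kind = ACK_OTHER;

	if ((int32_t)(a->ack_seq - s->snd_una) > 0) {
		/* Acknowledges new data. */
		kind = ACK_NEW_DATA;
		s->snd_una = a->ack_seq;
	} else if (a->ack_seq == s->snd_una && s->data_outstanding &&
		   a->payload_len == 0 && !a->syn_or_fin &&
		   a->window == s->last_window) {
		/* Same ACK number, no data, no SYN/FIN, same window. */
		kind = ACK_DUPLICATE;
	}

	/* Remember the window so that the next ACK is compared to it. */
	s->last_window = a->window;
	return kind;
}
```

With this, an ack that repeats the ACK number but changes the window is
ACK_OTHER and is ignored, while the next ack repeating both the ACK
number and the new window is ACK_DUPLICATE, which is what should lead
to (3a).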

Michal Kubecek
