Re: TCP stack bug related to F-RTO?

From: zhigang gong
Date: Fri Sep 25 2009 - 04:55:45 EST


Oh, I see, so I spoke too quickly in my last mail. You just omitted some
packets from the trace. I have analysed the traffic flow and have some
findings below; I hope they are helpful.

>> > 1. The client opens up a big window,
>> > 2. the server sends 19 packets in a row (pkt #14- #32
>> in the trace), but all of them are dropped due to some
>> congestion.
>> > 3. The server hits RTO and retransmits pkt #14 in #33
This retransmission timer expiry makes the server's TCP/IP stack enter
slow start; as a result, the server's sending window is reduced to one
segment.

>> > 4. The client immediately acks #33 (=#14), and the
>> server (seems like to enter F-RTO) expands the window and
>> sends *NEW* pkt #35 & #36. Timeout is doubled to
>> 2*RTO; The client immediately sends two Dup-ack to #35 and
>> #36.

The server is still in slow start and extends the window to 2.

>> > 5. after 2*RTO, pkt #15 is retransmitted in #39.

Here the second retransmission timer expiry occurs. The server's sending
window is reduced to one again and it continues in slow start.
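
To make the window behaviour in steps 3-6 concrete, here is a minimal
sketch in Python. It is a toy model and not kernel code. The assumptions
are mine: the window is counted in whole segments, an RTO resets it to
one, each ACK of new data grows it by one, ssthresh is ignored, and the
starting value of 19 is only a guess based on the 19 packets sent in a
row. The function name replay is hypothetical.

```python
# Toy model of the sender's window (in segments); not kernel code.
# Assumptions: an RTO resets the window to 1, each ACK of new data in
# slow start grows it by 1, duplicate ACKs leave it unchanged.

def replay(events, cwnd=19):
    """Return the window size after each event in `events`.

    Each event is "rto" (retransmission timer expired), "ack" (an ACK
    for new data) or "dupack" (a duplicate ACK).
    """
    history = []
    for event in events:
        if event == "rto":
            cwnd = 1      # timer expiry: restart slow start from one segment
        elif event == "ack":
            cwnd += 1     # slow start: one more segment per new ACK
        history.append(cwnd)
    return history

# Steps 3-6 of the trace: RTO, ACK of #33, two dup ACKs, RTO, ACK of #39.
steps = ["rto", "ack", "dupack", "dupack", "rto", "ack"]
print(replay(steps))  # [1, 2, 2, 2, 1, 2]
```

Under these assumptions the window never gets past two segments, which
matches the two *NEW* packets seen after each retransmission.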

>> > 6. The client immediately acks #39 (=#15) in #40, and
>> the server continues to expand the window and sends two
>> *NEW* pkt #41 & #42. Now the timeout is doubled to 4
>> *RTO.
Here you ignore the two duplicate ACKs #37 and #38 sent by the client. As
far as I know, the server must receive three or more duplicate ACKs before
it enters fast retransmit; otherwise it stays in slow start and waits for
the retransmission timer to expire again before retransmitting the lost
packets. And this is actually what you got.
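
The duplicate ACK rule can be sketched like this in Python. It is only an
illustration of the threshold of three from the TCP standard (RFC 5681),
not how the Linux stack implements it. The function name and the ACK
number 1000 are made up for the example.

```python
DUPACK_THRESHOLD = 3  # duplicate ACKs needed for fast retransmit (RFC 5681)

def fast_retransmit_triggered(acks, threshold=DUPACK_THRESHOLD):
    """Return True if `acks` ever contains `threshold` duplicates in a row.

    `acks` is a list of ACK numbers in arrival order. The first ACK with
    a given number is not a duplicate; only its repeats are counted.
    """
    last = None
    dups = 0
    for ack in acks:
        if ack == last:
            dups += 1
            if dups >= threshold:
                return True
        else:
            last = ack    # new ACK number: start counting again
            dups = 0
    return False

# One ACK plus two duplicates, as in the trace: below the threshold.
print(fast_retransmit_triggered([1000, 1000, 1000]))        # False
# One ACK plus three duplicates: fast retransmit would start.
print(fast_retransmit_triggered([1000, 1000, 1000, 1000]))  # True
```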

I'm not a kernel expert; I am just analysing this against the TCP protocol
standard. From my view, there is no problem in the server's network stack.
But there may be some problem on the client side (or in some intermediate
network appliance), as it always sends just two duplicate ACKs at the same
time and never sends a third one, no matter how long the interval is. In my
opinion, if the client sent a third duplicate ACK, the server would enter
fast retransmit and then fast recovery, and everything would be OK.

>> > 8. After 4*RTO timeout, #16 is retransmitted.
>> > 9....
>> > 10. The above steps repeats for retransmitting pkt
>> #16-#32 and each time the timeout is doubled.
>> > 11. It takes a long long time to retransmit all the
>> lost packets and before that is done, the client sends a RST
>> because of timeout.
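
A rough sketch of the arithmetic behind steps 10 and 11, in Python. The
initial RTO of one second and the optional cap are assumptions for
illustration only; the trace does not give them to me. The function name
total_wait is hypothetical, and the model simply follows the description
above, where the timeout doubles before every retransmission.

```python
def total_wait(initial_rto, retransmissions, max_rto=None):
    """Sum the timeouts waited before each of `retransmissions` retransmits.

    The timeout starts at `initial_rto` seconds and doubles after every
    retransmission; if `max_rto` is given, it is never allowed above it.
    """
    total = 0.0
    rto = initial_rto
    for _ in range(retransmissions):
        total += rto              # wait one full timeout, then retransmit
        rto *= 2                  # exponential backoff
        if max_rto is not None:
            rto = min(rto, max_rto)
    return total

# 19 lost packets (#14-#32), assumed initial RTO of 1 second, no cap.
print(total_wait(1.0, 19))                  # 524287.0
# Same, with an assumed cap of 120 seconds on the timeout.
print(total_wait(1.0, 19, max_rto=120.0))   # 1567.0
```

Even with a cap, recovering one packet per timeout takes far longer than
a typical client is willing to wait, so the RST in step 11 is no surprise.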

On Fri, Sep 25, 2009 at 2:42 PM, Joe Cao <caoco2002@xxxxxxxxx> wrote:
> Hi,
>
> On the wrong tcp checksum, that's because of hardware checksum offload.
>
> As for the seq/ack number, because the trace is long, I deliberately removed the irrelevant packets between the three-way handshake and the point where the problem happens. That can be seen from the timestamps.
>
> Please also note that I intentionally replaced the IP addresses and mac addresses in the trace to hide proprietary information in the trace.
>
> Anyway, the problem is not related to the checksum, or seq/ack number, otherwise, you won't see the behavior shown in the trace.
>
> Thanks,
> Joe
>
> --- On Thu, 9/24/09, zhigang gong <zhigang.gong@xxxxxxxxx> wrote:
>