Re: [patch] TCP/IP delacks disabled with MPI

Andrea Arcangeli (andrea@suse.de)
Mon, 24 May 1999 14:33:58 +0200 (CEST)


On Mon, 24 May 1999, Andi Kleen wrote:

>decremented by acks (and coincidentely we already have such a counter -
>tp->packets_out / tcp_packets_in_flight). The send queue check is far
>too costly.

packets_out, tcp_packets_in_flight and the send queue have nothing to do
with the receiver's delayed acks. I am changing the receiver, not the
sender. The information I want is how many not-yet-acked packets are in
the receive queue. Browsing the receive list will run in O(1) most of the
time and looks like a robust way to get that info.
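
To illustrate why such a walk is usually cheap, here is a minimal sketch.
It is not the patch itself: `struct rx_seg', `seq_after' and
`count_unacked' are hypothetical names made up for this example, and it
assumes the queue is walked from the newest segment back towards the
oldest, stopping at the first segment that has already been acked.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queued segment; NOT the kernel's struct sk_buff. */
struct rx_seg {
    struct rx_seg *prev;   /* next older segment, NULL at the queue head */
    uint32_t end_seq;      /* sequence number just past this segment's data */
};

/* True if sequence number a is after b, modulo 2^32. */
static int seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Count queued segments that end after last_acked, walking from the
 * newest segment (tail) towards the oldest. The loop stops at the first
 * segment already covered by an ack, so with only one or two unacked
 * segments it touches only one or two list entries.
 */
static unsigned int count_unacked(const struct rx_seg *tail,
                                  uint32_t last_acked)
{
    unsigned int n = 0;
    const struct rx_seg *s;

    for (s = tail; s != NULL && seq_after(s->end_seq, last_acked); s = s->prev)
        n++;
    return n;
}
```

With delayed acks the number of unacked segments is normally very small,
so the cost of the walk does not depend on the total queue length.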

>But the trace shows that the problem is that sender doesn't get out of
>slow start quickly enough because of the delacks. Now we already have

No. The performance drop due to breaking RFC1122 has nothing to do with the
sender. Maybe the sender should also increase cwnd by more than 1, but
that's a different issue (and reading the RFC it seems that the sender is
OK increasing cwnd only by 1).

>a mechanism in 2.2 to work around that - quick ack mode. It gets enabled

Quick ack mode is not a mechanism to work around that. It's just that when
we get an OOO segment we want to ack the received packet immediately to
update the sender about our current rcv_nxt.

Quick ack is also needed to generate the first ack immediately after the
three-way handshake.
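
As a rough sketch of the two cases just described (the function name and
its parameters are hypothetical, chosen for this example only, and are
not taken from the 2.2 sources):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether an incoming data segment should be acked at once
 * instead of being left to the delayed-ack timer.
 *
 * seg_seq:          sequence number of the first byte in the segment
 * rcv_nxt:          next sequence number the receiver expects
 * first_after_3whs: true for the first data segment after the handshake
 */
static bool needs_immediate_ack(uint32_t seg_seq, uint32_t rcv_nxt,
                                bool first_after_3whs)
{
    /* An out-of-order segment does not start at rcv_nxt. */
    bool out_of_order = (seg_seq != rcv_nxt);

    return first_after_3whs || out_of_order;
}
```

An in-order segment later in the connection takes neither branch and is
left to the normal delayed-ack rules.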

>Currently it only acks quickly a single packet (I believe that was been
>done to optimize HTTP mainly), but that could be extended without too

Quick ack is not only for quickly acking the first packet after the
three-way handshake.

>So I believe the right fix is to add a new sysctl that enlarges quick ack
>mode to more packets.

Could you show me how this trace will continue with your fix applied?

18:49:12.911613 localhost.1155 > localhost.3333: S 3745394879:3745394879(0) win 31072 <mss 3884,sackOK,timestamp 1616487 0,nop,wscale 0> (DF)
18:49:12.911708 localhost.3333 > localhost.1155: S 3746598727:3746598727(0) ack 3745394880 win 31072 <mss 3884,sackOK,timestamp 1616487 1616487,nop,wscale 0> (DF)
18:49:12.911751 localhost.1155 > localhost.3333: . ack 1 win 31072 <nop,nop,timestamp 1616487 1616487> (DF)
18:49:12.911839 localhost.1155 > localhost.3333: P 1:2(1) ack 1 win 31072 <nop,nop,timestamp 1616487 1616487> (DF)
18:49:12.911897 localhost.3333 > localhost.1155: . ack 2 win 31071 <nop,nop,timestamp 1616487 1616487> (DF)
18:49:12.911934 localhost.1155 > localhost.3333: P 2:3(1) ack 1 win 31072 <nop,nop,timestamp 1616487 1616487> (DF)
18:49:12.911966 localhost.1155 > localhost.3333: P 3:4(1) ack 1 win 31072 <nop,nop,timestamp 1616487 1616487> (DF)
[??]

The problem is that if you delay the ack until you have 2*MSS of data
queued, then you'll stall in [??] for `ato' time (which will be at best a
10msec delay at that point). You'll stall because, if the sender is not
buggy, it will block in [??] because cwnd at that point is only equal
to 2. Without my patch the Linux receiver won't send an ack because it
has only 2 bytes queued, which is not more than 2*MSS of data. But RFC1122
clearly states that the receiver must send an ack at the _second_ packet,
not after 2*MSS of unacked data has been queued.
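
A small model of the stall, using the numbers from the trace (MSS 3884,
two 1-byte segments in flight, cwnd of 2). This is only a sketch: the
struct and function names are hypothetical, and the exact comparison used
for the byte-based rule (>= 2*MSS here) is an assumption.

```c
#include <stdbool.h>

/* Hypothetical receiver bookkeeping; NOT the kernel's TCP state. */
struct rx_state {
    unsigned int mss;            /* negotiated MSS in bytes */
    unsigned int unacked_bytes;  /* data bytes received since the last ack */
    unsigned int unacked_segs;   /* data segments received since the last ack */
};

/* Account for one received data segment of len bytes. */
static void rx_segment(struct rx_state *rx, unsigned int len)
{
    rx->unacked_bytes += len;
    rx->unacked_segs++;
}

/* Byte-based rule: ack only once 2*MSS of data is waiting. */
static bool ack_by_bytes(const struct rx_state *rx)
{
    return rx->unacked_bytes >= 2 * rx->mss;
}

/* Segment-based rule: ack at the second unacked segment. */
static bool ack_by_segments(const struct rx_state *rx)
{
    return rx->unacked_segs >= 2;
}

/* A slow-start sender may not send once cwnd segments are in flight. */
static bool sender_blocked(unsigned int cwnd, unsigned int in_flight)
{
    return in_flight >= cwnd;
}
```

After the segments 2:3 and 3:4 from the trace, the byte-based rule sees
2 bytes against a threshold of 7768 and stays silent, while the sender
with cwnd 2 and two segments in flight cannot send either. Only the
delayed-ack timer breaks the deadlock. The segment-based rule acks at
once.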

The sender has nothing to do with the issue. The sender is not Linux but
some random OS that only knows about the slow start algorithm and follows
rfc2001.txt word by word.

I don't think my patch is the wrong thing to do, at least going by
RFC1122. If RFC1122 has been obsoleted I would be glad if you would give
me the URL of a new spec. Thanks.

Andrea Arcangeli
