Re: [patch] TCP/IP delacks disabled with MPI

Andi Kleen (ak@muc.de)
Mon, 24 May 1999 15:25:11 +0200


On Mon, May 24, 1999 at 02:33:58PM +0200, Andrea Arcangeli wrote:
> On Mon, 24 May 1999, Andi Kleen wrote:
> >tp->packets_out / tcp_packets_in_flight). The send queue check is far
> >too costly.

Yes, sorry the tcp_packets_in_flight thing was a brain fart.

I still don't like your send queue check.

> >But the trace shows that the problem is that sender doesn't get out of
> >slow start quickly enough because of the delacks. Now we already have
>
> No. The performance drop due breaking RFC1122 has nothing to do with the
> sender. Maybe also the sender should incrase cwnd by more than 1 but
> that's a different issue (and reading RFC it seems that the sender is ok
> by increasing cwnd only by 1).
>
> >a mechanism in 2.2 to work around that - quick ack mode. It gets enabled
>
> quick ack mode is not a mechanizm to work around that. It's just that when
> we got an OOO segment we want to ack immediatly the received packet to
> update the sender about our current rcv_nxt.

Erm no, that it required (ack a OOO packet immediately), quick ack is a
heuristic to get the sender quickly out of slow start by not delaying acks.
It would be trivial to extend it to more than one packet after slow start
begin.

> >Currently it only acks quickly a single packet (I believe that was been
> >done to optimize HTTP mainly), but that could be extended without too
>
> quickack is not only to ack fast the first packet after the three way
> handshake.

???

>
> >So I believe the right fix is to add a new sysctl that enlarges quick ack
> >mode to more packets.
>
> Could you show me how this trace will continue with your fix applyed?
>
> 18:49:12.911613 localhost.1155 > localhost.3333: S 3745394879:3745394879(0) win 31072 <mss 3884,sackOK,timestamp 1616487 0,nop,wscale 0> (DF)
> 18:49:12.911708 localhost.3333 > localhost.1155: S 3746598727:3746598727(0) ack 3745394880 win 31072 <mss 3884,sackOK,timestamp 1616487 1616487,nop,wscale 0> (DF)
> 18:49:12.911751 localhost.1155 > localhost.3333: . ack 1 win 31072 <nop,nop,timestamp 1616487 1616487> (DF)
> 18:49:12.911839 localhost.1155 > localhost.3333: P 1:2(1) ack 1 win 31072 <nop,nop,timestamp 1616487 1616487> (DF)
> 18:49:12.911897 localhost.3333 > localhost.1155: . ack 2 win 31071 <nop,nop,timestamp 1616487 1616487> (DF)
> 18:49:12.911934 localhost.1155 > localhost.3333: P 2:3(1) ack 1 win 31072 <nop,nop,timestamp 1616487 1616487> (DF)
> 18:49:12.911966 localhost.1155 > localhost.3333: P 3:4(1) ack 1 win 31072 <nop,nop,timestamp 1616487 1616487> (DF)
> [??]
>
> The problem is that if you delay the ack until you'll have 2*MSS data
> queued, then you'll stall in [??] for `ato' time (that will be as best a
> 10msec delay at such point). You'll stall because if the sender is not
> buggy, the sender will block in [??] because cwnd at such point is only
> equal to 2. Without my patch linux-receiver won't send an ack because it
> only had 2byte queued and not more than 2*MSS data. But RFC1122 clearly
> state that the receiver must send an ack at the _second_ packet, not after
> 2*MSS unacked data is been queued.

With extended quick ack the ACK is not delayed for N (N is a sysctl; according
to David Borman 8 is a good N) packets after slow start. So it would behave
the same as your in this situation.

> I don't think my patch is the wrong thing to do at least by reading
> RFC1122. If rfc1122 is been obsoleted I would be glad if you would give me
> the url of a new spec. Thanks.

Not wrong, but I don't think it is the best way to fix the issue. Currently
the ack send decision check in Linux 2.2 has been grown to be very complex.
This is a bad thing. Your patch makes it much more complex.

Quick ack mode is a existing mechanism that could be extended to fix it
better.

For an overview of current working IETF papers on TCP see
http://www.ietf.org/html.charters/tcpimpl-charter.html, for acks see
RFC2581. The problems we have are quite clearly explained in 4.2 there,
they suggest simply to force an ACK after every two segments. I don't like
that, I would prefer to first try to fix it with quick ack mode.

The problem in Linux is clearly there, the question is only what the best
fix is.

-Andi

-- 
This is like TV. I don't like TV.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/