Pre-2.0.31-2 blocked-xmit network bug, `tcpdump` enclosed

Kristofer T. Karas (ktk@ktk.bidmc.harvard.edu)
Mon, 4 Aug 1997 15:02:47 -0400


I have observed what I believe is a bug in pre-2.0.31-2: a transmitter
in Linux that is blocked (because the peer's window is less than 1/2 MSS)
never restarts if the window-update ACK from the other end is lost in
transit (or otherwise not received).

The following tcpdump trace shows my linux box (ktk) attempting to
deliver email (RFC 821 and friends) to an NT box's cc:mail server (west).
It's a standard RFC 821/1123 exchange, which I have
annotated with comments following each associated line. After west
354's the DATA command, `ktk' never sends any data; 889 bytes of data
are stuck in the transmit queue (according to `netstat`), waiting for
west's window (880) to grow. After two minutes during which no packets
travel in either direction, west times out (the last two lines of the
dump).

If I understand the protocol correctly, Linux (`ktk') should
periodically send a packet to west, in case west's window size has
opened up (compensating for any missed ACKs); only if the window size
drops to 0 does the burden shift to west to send periodic acks back to
ktk. Correct?
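
To make the expectation concrete, here is a minimal sketch in Python of
the behaviour I would expect from a sender. It is not Linux's actual
code: the function name `should_send`, the 1-second `override_timeout`,
and the choice of 1460 as the MSS in the usage lines are all assumptions
made for illustration. The rule is loosely modelled on the sender-side
silly-window avoidance described in RFC 1122, simplified to the parts
relevant here.

```python
def should_send(queued, usable_window, mss, blocked_for, override_timeout=1.0):
    """Simplified, hypothetical sender rule: hold back small segments,
    but never wait longer than override_timeout seconds."""
    if queued <= 0 or usable_window <= 0:
        # Nothing to send, or a zero window (probing a zero window is a
        # separate mechanism that this sketch does not model).
        return False
    sendable = min(queued, usable_window)
    if sendable >= mss:
        return True   # a full-sized segment fits in the window
    if queued <= usable_window:
        return True   # everything queued fits, so nothing is left behind
    # A partial segment would leave data queued: wait for the window to
    # grow, but give up waiting once the override timer has expired.
    return blocked_for >= override_timeout


# Numbers from the trace: 889 bytes queued, west's window at 880.
print(should_send(889, 880, 1460, blocked_for=0.0))  # False
print(should_send(889, 880, 1460, blocked_for=2.0))  # True
```

Under these assumptions the sender holds back at first but transmits
once the timer expires, instead of staying silent for the two minutes
seen in the dump.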

Note that west's window size is 1024, yet it advertises an MSS of 1962.
Also, note that west's window size drops with each piece of data
received, never increasing as the SMTP server reads the data. Yet, I
still think that the fault is Linux's (the `ktk' box), despite the
weird behaviour from west.
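
The shrinking window can be checked against the trace with a few lines
of Python. The helper name `remaining_windows` is hypothetical; the
payload sizes and the starting window of 1024 are read straight from the
tcpdump lines.

```python
def remaining_windows(initial_window, payload_sizes):
    """Window left after each payload if the receiver never reopens it."""
    windows = []
    window = initial_window
    for size in payload_sizes:
        window -= size
        windows.append(window)
    return windows


# Payloads sent by ktk: EHLO, HELO, MAIL FROM, RCPT TO, DATA
payloads = [28, 28, 39, 43, 6]
print(remaining_windows(1024, payloads))  # [996, 968, 929, 886, 880]
```

These match the `win` values west advertises after each segment, so west
has not reopened its window at any point, and the 889 queued bytes exceed
the final 880 by 9 bytes.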

13:57:11.516626 ktk.21266 > west.smtp: S 983522210:983522210(0) win 512 <mss 1460>
13:57:11.556624 west.smtp > ktk.21266: S 117255994:117255994(0) ack 983522211 win 1024 <mss 1962>
13:57:11.556624 ktk.21266 > west.smtp: . ack 1 win 15360 (DF)
13:57:12.716578 west.smtp > ktk.21266: P 1:43(42) ack 1 win 1024 (DF)
"220 West..."
13:57:12.726577 ktk.21266 > west.smtp: P 1:29(28) ack 43 win 15360 (DF)
"EHLO ktk.bidmc.harvard.edu"
13:57:13.086563 west.smtp > ktk.21266: . ack 29 win 996
13:57:14.466507 west.smtp > ktk.21266: P 43:69(26) ack 29 win 996
"500 command unrecognized."
13:57:14.466507 ktk.21266 > west.smtp: P 29:57(28) ack 69 win 15360 (DF)
"HELO ktk..."
13:57:14.796494 west.smtp > ktk.21266: . ack 57 win 968
13:57:16.156439 west.smtp > ktk.21266: P 69:95(26) ack 57 win 968
"250 welcome, ktk"
13:57:16.166439 ktk.21266 > west.smtp: P 57:96(39) ack 95 win 15360 (DF)
"MAIL FROM:<...>"
13:57:16.436428 west.smtp > ktk.21266: . ack 96 win 929
13:57:18.196357 west.smtp > ktk.21266: P 95:111(16) ack 96 win 929
"250 <...> OK."
13:57:18.196357 ktk.21266 > west.smtp: P 96:139(43) ack 111 win 15360 (DF)
"RCPT TO:<...>"
13:57:18.466346 west.smtp > ktk.21266: . ack 139 win 886
13:57:18.766334 west.smtp > ktk.21266: P 111:130(19) ack 139 win 886
"250 <...> OK"
13:57:18.766334 ktk.21266 > west.smtp: P 139:145(6) ack 130 win 15360 (DF)
"DATA"
13:57:19.006324 west.smtp > ktk.21266: P 130:177(47) ack 145 win 880 (DF)
"354 Start data; end with CRLF.CRLF"
13:57:19.026323 ktk.21266 > west.smtp: . ack 177 win 15360 (DF)
Here, the `ktk' box waits with 889 bytes in its transmit queue for west's
window to grow beyond 880, which it never does.
13:59:19.851453 west.smtp > ktk.21266: P 177:206(29) ack 145 win 880 (DF)
13:59:19.851453 west.smtp > ktk.21266: RP 177:206(29) ack 145 win 880 (DF)
"554 Data not received..."

--
Kristofer Karas - SysAdmin, BI Deaconness Med. Ctr. - ktk@ktk.bidmc.harvard.edu
AMA/CCS, DoD, RF900RR, HawkGT, !car     ***  http://ktk.bidmc.harvard.edu/~ktk/
"Health nuts are going to feel stupid someday, lying in hospitals dying
 of nothing."  -- Redd Foxx             ***  Will design LISP machines for food