2.1.x (maybe 2.0.x) TCP Bug: SYN between FINs

Kristofer T. Karas (ktk@ktk.bidmc.harvard.edu)
Tue, 25 Aug 1998 13:57:44 -0400


Hi Alan,

Not sure if you are still working on TCP substrates (please fwd if
not), but I have finally figured out how to reproduce a TCP bug in
2.1.117 that I believe exists as far back as 2.0.x.

If you want to try something while reading the rest of this message,
put some host and a 2.1.117 box on the same bridging group, and on the
other host run two instances of the following:
    while [ 1 ]; do hose linux-2-1-117 discard -slave </dev/null ; done
Put a tcpdump on some other machine also on the same bridging group,
and enjoy. A sample output is below.
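
For hosts where hose is not installed, the loop above can be
approximated in Python. This is only a sketch: open_and_close is a
hypothetical helper name, the timeout is an arbitrary choice, and it
assumes that connecting and closing without sending any data is close
enough to what "hose ... -slave </dev/null" does to hit the same window.

```python
import socket

def open_and_close(host, port, count, timeout=5.0):
    """Open `count` TCP connections to (host, port), closing each at once.

    Returns one entry per attempt: "ok", or the name of the exception
    class raised while connecting.
    """
    results = []
    for _ in range(count):
        try:
            # Connect, send nothing, and close immediately.
            with socket.create_connection((host, port), timeout=timeout):
                pass
            results.append("ok")
        except OSError as exc:
            # A SYN answered with RST shows up as ConnectionRefusedError.
            results.append(type(exc).__name__)
    return results

# Example call (host name taken from the loop above; 9 is the discard port):
#   open_and_close("linux-2-1-117", 9, 100)
```

As with the shell loop, two instances should be run at the same time.
A result list that switches from "ok" to "ConnectionRefusedError" and
stays there would match the behaviour described in this message.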

The timing of 2.1.117 (using the slackware-3.5 inetd's internal
'discard' service) is such that there is a delay between receipt of a
FIN and transmission of the companion FIN. This delay affords a
window of opportunity for a new connection request to be received for
the same port. The bug, then: if the SYN for a new connection arrives
between the received FIN for an old connection and the acked FIN back
to the old connection, the acked FIN never goes out, and RST is
returned for all future packets arriving at that local port (regardless
of remote port). The interrupted connection remains in state CLOSE,
and the remote hosts report FIN_WAIT2. Killing the listening process
(e.g. inetd) and reinvoking fixes the problem. The mechanism is
probably a race between the process owning the socket closing it, and
the TCP substrate getting the last FIN out.
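
One way to spot such stuck sockets without watching netstat by hand is
to scan /proc/net/tcp for entries in state CLOSE. The sketch below
assumes the column layout used by current Linux kernels (state in the
fourth column as hex, with 07 meaning CLOSE); whether 2.0.x and 2.1.117
print exactly the same layout has not been checked here.
sockets_in_close is a hypothetical helper name.

```python
TCP_CLOSE = 0x07  # value of the "st" column for state CLOSE

def sockets_in_close(proc_net_tcp_text):
    """Return (local_port, remote_port) pairs for sockets in state CLOSE."""
    stuck = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if int(state, 16) == TCP_CLOSE:
            # Addresses are printed as HEXADDR:HEXPORT; only the port
            # is decoded here.
            local_port = int(local.rsplit(":", 1)[1], 16)
            remote_port = int(remote.rsplit(":", 1)[1], 16)
            stuck.append((local_port, remote_port))
    return stuck

# Example call:
#   with open("/proc/net/tcp") as f:
#       print(sockets_in_close(f.read()))
```

An entry whose local port is 9 (discard) and which persists across
repeated runs would correspond to the interrupted connection.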

I first noticed this in 2.0.x, as my Roxen webserver had several
connections perpetually stuck in state CLOSE. I wondered about it,
but never caught one of them in the act, so never debugged it.
Then I watched this bug appear before my eyes in seconds, while I was
using Linux's discard server (on 2.1.117) to help with an unrelated
matter. With a tcpdump running, I could capture the bug at will within
two or three seconds. So here's some sample output.

Kris
-----------
The first seven lines show one successful transaction to the discard
service; a host opens the connection, closes it without sending
anything, and repeats. During the second FIN/FIN sequence, note the
SYN received from another host (host-b), also for the discard service,
before linux has sent its own FIN. At that point, everything breaks
down: the FIN to host-a.46468 never appears, and every later SYN is
answered with RST.

12:08:42.010790 host-a.46404 > linux.discard: S 40189689:40189689(0) win 1023 <mss 768>
12:08:42.010790 linux.discard > host-a.46404: S 3475535809:3475535809(0) ack 40189690 win 32256 <mss 768> (DF)
12:08:42.010790 host-a.46404 > linux.discard: . ack 1 win 1023
12:08:42.010790 host-a.46404 > linux.discard: FP 1:1(0) ack 1 win 1023
12:08:42.010790 linux.discard > host-a.46404: . ack 2 win 32256 (DF)
12:08:43.000751 linux.discard > host-a.46404: F 1:1(0) ack 2 win 32256 (DF)
12:08:43.010750 host-a.46404 > linux.discard: . ack 2 win 1023
12:08:43.010750 host-a.46468 > linux.discard: S 16008153:16008153(0) win 1023 <mss 768>
12:08:43.010750 linux.discard > host-a.46468: S 3482354459:3482354459(0) ack 16008154 win 32256 <mss 768> (DF)
12:08:43.010750 host-a.46468 > linux.discard: . ack 1 win 1023
12:08:43.010750 host-a.46468 > linux.discard: FP 1:1(0) ack 1 win 1023
12:08:43.010750 linux.discard > host-a.46468: . ack 2 win 32256 (DF)
12:08:48.240541 host-b.47555 > linux.discard: S 8318361:8318361(0) win 1023 <mss 768>
12:08:48.240541 linux.discard > host-b.47555: R 0:0(0) ack 8318362 win 0
12:08:58.230140 host-b.47747 > linux.discard: S 9621177:9621177(0) win 1023 <mss 768>
12:08:58.230140 linux.discard > host-b.47747: R 0:0(0) ack 9621178 win 0
12:09:08.239739 host-b.47875 > linux.discard: S 25286745:25286745(0) win 1023 <mss 768>
12:09:08.239739 linux.discard > host-b.47875: R 0:0(0) ack 25286746 win 0
12:09:12.979549 host-a.46468 > linux.discard: R 2:2(0) ack 1 win 1023
12:09:12.979549 host-a.47748 > linux.discard: S 28337241:28337241(0) win 1023 <mss 768>
12:09:12.979549 linux.discard > host-a.47748: R 0:0(0) ack 28337242 win 0
12:09:17.979349 host-a.47812 > linux.discard: S 25509177:25509177(0) win 1023 <mss 768>
12:09:17.979349 linux.discard > host-a.47812: R 0:0(0) ack 25509178 win 0
12:09:18.239338 host-b.48259 > linux.discard: S 55950585:55950585(0) win 1023 <mss 768>
12:09:18.239338 linux.discard > host-b.48259: R 0:0(0) ack 55950586 win 0
12:09:22.979148 host-a.47940 > linux.discard: S 31038201:31038201(0) win 1023 <mss 768>
12:09:22.979148 linux.discard > host-a.47940: R 0:0(0) ack 31038202 win 0
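
A trace like the one above can also be checked mechanically. The
following is a small sketch that assumes the tcpdump line format shown
here (time, "src > dst:", then a flags field); the function names and
the default server name "linux.discard" are just illustrative choices.
It reports clients whose FIN the server never matched with a FIN of its
own, and clients whose SYN was answered with RST.

```python
import re

# time, source, destination, flags ("S", "FP", "R", or "." for none)
LINE = re.compile(r"^(\S+) (\S+) > (\S+): (\S+) ")

def parse(trace):
    """Yield (src, dst, flags) for each line that looks like a record."""
    for line in trace.splitlines():
        m = LINE.match(line.strip())
        if m:
            yield m.group(2), m.group(3), m.group(4)

def unanswered_fins(trace, server="linux.discard"):
    """Clients that sent FIN to the server but never got a FIN back."""
    client_fin = []     # clients that sent FIN, in order of appearance
    server_fin = set()  # clients the server sent FIN to
    for src, dst, flags in parse(trace):
        if "F" not in flags:
            continue
        if dst == server and src not in client_fin:
            client_fin.append(src)
        elif src == server:
            server_fin.add(dst)
    return [c for c in client_fin if c not in server_fin]

def refused_syns(trace, server="linux.discard"):
    """Clients whose SYN to the server was answered with RST."""
    syn_sent = set()
    refused = []
    for src, dst, flags in parse(trace):
        if dst == server and "S" in flags:
            syn_sent.add(src)
        elif src == server and "R" in flags and dst in syn_sent:
            refused.append(dst)
    return refused
```

Run over the sample output, unanswered_fins should single out the
interrupted connection (host-a.46468), and refused_syns should list the
connection attempts that came after it.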
