RE: TCP keepalive timer problem

From: Li_Xin2
Date: Tue Aug 25 2009 - 10:06:17 EST



Thanks for your quick reply. Let me explain my problem in detail.

Suppose the client sets the SO_KEEPALIVE socket option and connects to the server, and we then pull the network cable out of the server box. After the connection has been idle for TCP_KEEPIDLE seconds, the first keepalive probe is sent, and of course no reply is received. Just after that first probe, the client sends some data. No response is received and, as you said, normal retransmission takes over and no further keepalive probes are sent.
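
For reference, the client-side setup described above can be sketched roughly as follows. This is only an illustration, not our actual application code: the helper name enable_keepalive and the values 30/10/3 are made up for the example, and the TCP_KEEP* constants are the Linux ones exposed by Python's socket module.

```python
import socket

def enable_keepalive(sock, idle_s, interval_s, probe_count):
    """Turn on keepalive and set the per-socket timers (Linux option names)."""
    # Without SO_KEEPALIVE the three TCP_KEEP* values are never used.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Idle time before the first probe is sent.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    # Gap between unanswered probes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    # Number of unanswered probes before the connection is dropped.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probe_count)

# Hypothetical values chosen so that idle + count * interval = 60 seconds.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, idle_s=30, interval_s=10, probe_count=3)
sock.close()
```

The options are set before connect() here, but they can also be set on an already connected socket.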

The problem is: an application that uses the keepalive mechanism expects to detect a peer crash within TCP_KEEPIDLE + TCP_KEEPCNT * TCP_KEEPINTVL seconds. An application may set relatively small TCP_KEEPIDLE, TCP_KEEPCNT and TCP_KEEPINTVL values so that a peer crash is detected quickly, for example within 60 seconds. But if retransmission starts while keepalive probing is under way, retransmission takes priority, so the peer crash is only detected after 13 to 30 minutes, which may not be acceptable for some applications.
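
To make the expectation concrete, here is the arithmetic as a tiny sketch. The function name and the sample values are only illustrative; the formula is the one given above.

```python
def keepalive_detection_seconds(keepidle, keepcnt, keepintvl):
    """Time the application expects to need to detect a dead peer.

    Mirrors TCP_KEEPIDLE + TCP_KEEPCNT * TCP_KEEPINTVL; all values in seconds
    except keepcnt, which is a probe count.
    """
    return keepidle + keepcnt * keepintvl

# Hypothetical settings: 30 s idle, then 3 probes 10 s apart.
expected = keepalive_detection_seconds(keepidle=30, keepcnt=3, keepintvl=10)
print(expected, "seconds")
```

With these sample values the application expects detection within 60 seconds, which is far from the 13 to 30 minutes it gets once retransmission takes over.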

We also tried the TCP implementation on Windows XP SP3; there, keepalive and retransmission do not interfere with each other.

Regards,
Xin Li
EMC Shanghai R&D Centre
Email: Li_Xin2@xxxxxxx
Tel: 86 21 6095 1100 x 2257

-----Original Message-----
From: Eric Dumazet [mailto:eric.dumazet@xxxxxxxxx]
Sent: August 25, 2009 21:13
To: Li, Xin
Cc: linux-kernel@xxxxxxxxxxxxxxx; Linux Netdev List
Subject: Re: TCP keepalive timer problem

Li_Xin2@xxxxxxx wrote:
> Greetings,
>
> I found a problem in Linux TCP keepalive timer processing. After
> searching on Google, I found that Daniel Stempel reported the same problem in
> 2007 (http://lkml.indiana.edu/hypermail/linux/kernel/0702.2/1136.html),
> but got no answer, so I have to raise it again.
>
> Can anyone help answer this two-year-old question?
>
>

You should explain your problem in detail, since Daniel's was probably different.

He mentioned "(timeout is set to e.g. 30 seconds)", which is kind of nasty, given that the normal one is 7200 seconds.

If some packets are in flight, keepalive is not fired at all, since normal
retransmits should take place (check tcp_retries2 sysctl).

TCP keepalive is only fired when no traffic has occurred for a long time, and only if
the SO_KEEPALIVE socket option was enabled by the application.

tcp_retries2 (integer; default: 15)
The maximum number of times a TCP packet is retransmitted in established state
before giving up. The default value is 15, which corresponds to a duration of
approximately between 13 to 30 minutes, depending on the retransmission timeout.
The RFC 1122 specified minimum limit of 100 seconds is typically deemed too short.
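
As a rough illustration of where a figure in that range comes from, the sketch below sums the waits of an exponentially backed-off retransmission timer. It is a simplified model, not the kernel's actual logic: it assumes the timeout doubles after every attempt, starts at a given initial RTO (200 ms is assumed here as the smallest value) and is capped at 120 seconds. The function name is made up for the example.

```python
def retransmit_give_up_seconds(retries=15, initial_rto=0.2, max_rto=120.0):
    """Approximate time until TCP gives up on an unacknowledged segment.

    Simplified model: one wait after the original transmission plus one wait
    after each of the `retries` retransmissions, with the RTO doubling each
    time up to `max_rto`. The defaults are assumptions, not measured values.
    """
    total = 0.0
    rto = initial_rto
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, max_rto)
    return total

fast_path = retransmit_give_up_seconds()                 # RTO starts at 200 ms
slow_path = retransmit_give_up_seconds(initial_rto=3.0)  # RTO starts at 3 s
print(round(fast_path / 60, 1), "to", round(slow_path / 60, 1), "minutes")
```

Under these assumptions tcp_retries2 = 15 gives about 924.6 seconds (roughly 15 minutes) when the RTO starts at 200 ms and about 1389 seconds (roughly 23 minutes) when it starts at 3 seconds, which is consistent with the range quoted above.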

