[PATCH net-next] tcp: Improve setsockopt() TCP_USER_TIMEOUT accuracy

From: Jon Maxwell
Date: Mon Jul 02 2018 - 21:17:54 EST


Every time the TCP retransmission timer fires. It checks to see if there is a
timeout before scheduling the next retransmit timer. The retransmit interval
between each retransmission increases exponentially. The issue is that in order
for the timeout to occur the retransmit timer needs to fire again. If the user
timeout check happens after the 9th retransmit for example. It needs to wait for
the 10th retransmit timer to fire in order to evaluate whether a timeout has
occurred or not. If the interval is large enough then the timeout will be
inaccurate.

For example with a TCP_USER_TIMEOUT of 10 seconds without patch:

1st retransmit:

22:25:18.973488 IP host1.49310 > host2.search-agent: Flags [.]

Last retransmit:

22:25:26.205499 IP host1.49310 > host2.search-agent: Flags [.]

Timeout:

send: Connection timed out
Sun Jul 1 22:25:34 EDT 2018

We can see that last retransmit took ~7 seconds. Which pushed the total
timeout to ~15 seconds instead of the expected 10 seconds. This gets more
inaccurate the larger the TCP_USER_TIMEOUT value. As the interval increases.

Fix this by recalculating the last retransmit interval so that it fires when
the timeout should occur. Only implement when icsk->icsk_user_timeout is set.

Test results with the patch is the expected 10 second timeout:

1st retransmit:

01:37:59.022555 IP host1.49310 > host2.search-agent: Flags [.]

Last retransmit:

01:38:06.486558 IP host1.49310 > host2.search-agent: Flags [.]

Timeout:

send: Connection timed out
Mon Jul 2 01:38:09 EDT 2018

Signed-off-by: Jon Maxwell <jmaxwell37@xxxxxxxxx>
---
net/ipv4/tcp_timer.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 3b3611729928..94491a481722 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -407,6 +407,7 @@ void tcp_retransmit_timer(struct sock *sk)
struct tcp_sock *tp = tcp_sk(sk);
struct net *net = sock_net(sk);
struct inet_connection_sock *icsk = inet_csk(sk);
+ __u32 time_remaining = 0;

if (tp->fastopen_rsk) {
WARN_ON_ONCE(sk->sk_state != TCP_SYN_RECV &&
@@ -535,6 +536,12 @@ void tcp_retransmit_timer(struct sock *sk)
/* Use normal (exponential) backoff */
icsk->icsk_rto = min(icsk->icsk_rto << 1, TCP_RTO_MAX);
}
+ if (icsk->icsk_user_timeout) {
+ time_remaining = jiffies_to_msecs(icsk->icsk_user_timeout) -
+ (tcp_time_stamp(tcp_sk(sk)) - tcp_sk(sk)->retrans_stamp);
+ if (time_remaining < icsk->icsk_rto)
+ icsk->icsk_rto = time_remaining;
+ }
inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, icsk->icsk_rto, TCP_RTO_MAX);
if (retransmits_timed_out(sk, net->ipv4.sysctl_tcp_retries1 + 1, 0))
__sk_dst_reset(sk);
--
2.13.6