[PATCH 5.10 142/142] tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN

From: Greg Kroah-Hartman
Date: Tue Feb 02 2021 - 09:05:07 EST

Next message: kernel test robot: "Re: [PATCH v12 14/14] powerpc/64s/radix: Enable huge vmalloc mappings"
Previous message: Greg Kroah-Hartman: "[PATCH 5.10 115/142] net/mlx5e: E-switch, Fix rate calculation for overflow"
In reply to: Greg Kroah-Hartman: "[PATCH 5.10 115/142] net/mlx5e: E-switch, Fix rate calculation for overflow"
Next in thread: Pavel Machek: "Re: [PATCH 5.10 000/142] 5.10.13-rc1 review"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

From: Pengcheng Yang <yangpc@xxxxxxxxxx>

commit 62d9f1a6945ba69c125e548e72a36d203b30596e upstream.

Upon receiving a cumulative ACK that changes the congestion state from
Disorder to Open, the TLP timer is not set. If the sender is app-limited,
it can only wait for the RTO timer to expire and retransmit.

The reason for this is that the TLP timer is set before the congestion
state changes in tcp_ack(), so we delay the time point of calling
tcp_set_xmit_timer() until after tcp_fastretrans_alert() returns and
remove the FLAG_SET_XMIT_TIMER from ack_flag when the RACK reorder timer
is set.

This commit has two additional benefits:
1) Make sure to reset RTO according to RFC6298 when receiving ACK, to
avoid spurious RTO caused by RTO timer early expires.
2) Reduce the xmit timer reschedule once per ACK when the RACK reorder
timer is set.

Fixes: df92c8394e6e ("tcp: fix xmit timer to only be reset if data ACKed/SACKed")
Link: https://lore.kernel.org/netdev/1611311242-6675-1-git-send-email-yangpc@xxxxxxxxxx
Signed-off-by: Pengcheng Yang <yangpc@xxxxxxxxxx>
Acked-by: Neal Cardwell <ncardwell@xxxxxxxxxx>
Acked-by: Yuchung Cheng <ycheng@xxxxxxxxxx>
Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
Link: https://lore.kernel.org/r/1611464834-23030-1-git-send-email-yangpc@xxxxxxxxxx
Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
include/net/tcp.h | 2 +-
net/ipv4/tcp_input.c | 10 ++++++----
net/ipv4/tcp_recovery.c | 5 +++--
3 files changed, 10 insertions(+), 7 deletions(-)

--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2066,7 +2066,7 @@ void tcp_mark_skb_lost(struct sock *sk,
void tcp_newreno_mark_lost(struct sock *sk, bool snd_una_advanced);
extern s32 tcp_rack_skb_timeout(struct tcp_sock *tp, struct sk_buff *skb,
u32 reo_wnd);
-extern void tcp_rack_mark_lost(struct sock *sk);
+extern bool tcp_rack_mark_lost(struct sock *sk);
extern void tcp_rack_advance(struct tcp_sock *tp, u8 sacked, u32 end_seq,
u64 xmit_time);
extern void tcp_rack_reo_timeout(struct sock *sk);
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2845,7 +2845,8 @@ static void tcp_identify_packet_loss(str
} else if (tcp_is_rack(sk)) {
u32 prior_retrans = tp->retrans_out;

- tcp_rack_mark_lost(sk);
+ if (tcp_rack_mark_lost(sk))
+ *ack_flag &= ~FLAG_SET_XMIT_TIMER;
if (prior_retrans > tp->retrans_out)
*ack_flag |= FLAG_LOST_RETRANS;
}
@@ -3802,9 +3803,6 @@ static int tcp_ack(struct sock *sk, cons

if (tp->tlp_high_seq)
tcp_process_tlp_ack(sk, ack, flag);
- /* If needed, reset TLP/RTO timer; RACK may later override this. */
- if (flag & FLAG_SET_XMIT_TIMER)
- tcp_set_xmit_timer(sk);

if (tcp_ack_is_dubious(sk, flag)) {
if (!(flag & (FLAG_SND_UNA_ADVANCED | FLAG_NOT_DUP))) {
@@ -3817,6 +3815,10 @@ static int tcp_ack(struct sock *sk, cons
&rexmit);
}

+ /* If needed, reset TLP/RTO timer when RACK doesn't set. */
+ if (flag & FLAG_SET_XMIT_TIMER)
+ tcp_set_xmit_timer(sk);
+
if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
sk_dst_confirm(sk);

--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -96,13 +96,13 @@ static void tcp_rack_detect_loss(struct
}
}

-void tcp_rack_mark_lost(struct sock *sk)
+bool tcp_rack_mark_lost(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
u32 timeout;

if (!tp->rack.advanced)
- return;
+ return false;

/* Reset the advanced flag to avoid unnecessary queue scanning */
tp->rack.advanced = 0;
@@ -112,6 +112,7 @@ void tcp_rack_mark_lost(struct sock *sk)
inet_csk_reset_xmit_timer(sk, ICSK_TIME_REO_TIMEOUT,
timeout, inet_csk(sk)->icsk_rto);
}
+ return !!timeout;
}

/* Record the most recently (re)sent time among the (s)acked packets

Next message: kernel test robot: "Re: [PATCH v12 14/14] powerpc/64s/radix: Enable huge vmalloc mappings"
Previous message: Greg Kroah-Hartman: "[PATCH 5.10 115/142] net/mlx5e: E-switch, Fix rate calculation for overflow"
In reply to: Greg Kroah-Hartman: "[PATCH 5.10 115/142] net/mlx5e: E-switch, Fix rate calculation for overflow"
Next in thread: Pavel Machek: "Re: [PATCH 5.10 000/142] 5.10.13-rc1 review"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]