Re: [PATCH net v2] tcp: reset tp->sacked_out when sack is enabled

From: Eric Dumazet
Date: Thu Oct 27 2022 - 07:09:05 EST


On Thu, Oct 27, 2022 at 3:45 AM Lu Wei <luwei32@xxxxxxxxxx> wrote:
>
> If setsockopt() is called with option name TCP_REPAIR_OPTIONS and
> opt_code TCPOPT_SACK_PERM to enable sack after data is sent but
> before it is acked, it triggers a warning in tcp_verify_left_out()
> as follows:
>
> ============================================
> WARNING: CPU: 8 PID: 0 at net/ipv4/tcp_input.c:2132
> tcp_timeout_mark_lost+0x154/0x160
> tcp_enter_loss+0x2b/0x290
> tcp_retransmit_timer+0x50b/0x640
> tcp_write_timer_handler+0x1c8/0x340
> tcp_write_timer+0xe5/0x140
> call_timer_fn+0x3a/0x1b0
> __run_timers.part.0+0x1bf/0x2d0
> run_timer_softirq+0x43/0xb0
> __do_softirq+0xfd/0x373
> __irq_exit_rcu+0xf6/0x140
>
> This warning occurs in several steps:
> Step1. With sack disabled, when the server receives a dup-ack, it
> calls tcp_add_reno_sack() to increase tp->sacked_out.
>
> Step2. setsockopt() is called to enable sack.
>
> Step3. The retransmit timer expires and calls tcp_timeout_mark_lost(),
> which increases tp->lost_out but does not clear tp->sacked_out because
> sack is enabled and tcp_is_reno() is false.
>
> So tp->left_out is increased in both Step1 and Step3, becomes greater
> than tp->packets_out, and triggers the warning. If Step2 does not
> happen, tcp_timeout_mark_lost() clears tp->sacked_out and the warning
> is not triggered. As suggested by Denis and Eric, TCP_REPAIR_OPTIONS
> should be prohibited if data was already sent.
>
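
The three steps above can be sketched with a toy user-space model. This
is not kernel code: struct toy_tp and the toy_* helpers are hypothetical
names, only the counters named in the commit message are modeled, and
the timeout path is simplified to "mark every outstanding segment lost".
It assumes left_out is sacked_out + lost_out and that the warning fires
when that sum exceeds packets_out.

```c
#include <stdbool.h>

/* Toy stand-in for the few struct tcp_sock counters discussed here. */
struct toy_tp {
	unsigned int packets_out; /* segments sent, not yet cumulatively ACKed */
	unsigned int sacked_out;  /* SACKed segments, or dup-ACK count in Reno mode */
	unsigned int lost_out;    /* segments marked lost */
	bool sack_ok;             /* SACK in use (negotiated or set via repair) */
};

static bool toy_is_reno(const struct toy_tp *tp)
{
	return !tp->sack_ok;
}

/* Step1: without SACK, a dup-ACK is counted as one SACKed segment. */
static void toy_add_reno_sack(struct toy_tp *tp)
{
	tp->sacked_out++;
}

/* Step3 (simplified): only the Reno path resets sacked_out before
 * all outstanding segments are marked lost. */
static void toy_timeout_mark_lost(struct toy_tp *tp)
{
	if (toy_is_reno(tp))
		tp->sacked_out = 0;
	tp->lost_out = tp->packets_out;
}

/* The invariant the warning checks: left_out <= packets_out. */
static bool toy_left_out_is_sane(const struct toy_tp *tp)
{
	return tp->sacked_out + tp->lost_out <= tp->packets_out;
}

/* Runs Step1..Step3 with three segments in flight and reports whether
 * the invariant still holds afterwards. */
bool toy_run(bool enable_sack_midway)
{
	struct toy_tp tp = { .packets_out = 3 };

	toy_add_reno_sack(&tp);         /* Step1 */
	if (enable_sack_midway)
		tp.sack_ok = true;      /* Step2 */
	toy_timeout_mark_lost(&tp);     /* Step3 */
	return toy_left_out_is_sane(&tp);
}
```

In this model toy_run(false) keeps the invariant, while toy_run(true)
leaves a stale Reno count in sacked_out and breaks it.
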
> The socket-tcp tests in CRIU were run as follows:
> $ sudo ./test/zdtm.py run -t zdtm/static/socket-tcp* --keep-going \
> --ignore-taint
>
> socket-tcp* matches all socket-tcp tests in test/zdtm/static/.
>
> Fixes: b139ba4e90dc ("tcp: Repair connection-time negotiated parameters")
> Signed-off-by: Lu Wei <luwei32@xxxxxxxxxx>
> ---
> net/ipv4/tcp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index ef14efa1fb70..ef876e70f7fe 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3647,7 +3647,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
>         case TCP_REPAIR_OPTIONS:
>                 if (!tp->repair)
>                         err = -EINVAL;
> -               else if (sk->sk_state == TCP_ESTABLISHED)
> +               else if (sk->sk_state == TCP_ESTABLISHED && !tp->packets_out)

You keep focusing on packets_out :/

What I said was: TCP_REPAIR_OPTIONS must be denied if any packets
have been sent (and possibly already ACKed).

Looking at tp->packets_out alone is not sufficient.
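
A toy illustration of the difference, not kernel code: struct toy_conn
and the toy_* helpers are hypothetical, and using a cumulative counter
like bytes_sent as the "anything ever sent" test is an assumption made
for this sketch, not something the thread settles. The point is only
that an in-flight count drops back to zero once everything is ACKed,
while a cumulative counter does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy connection state: one in-flight counter, one cumulative counter. */
struct toy_conn {
	unsigned int packets_out; /* segments currently unacknowledged */
	uint64_t bytes_sent;      /* total payload bytes ever sent */
};

/* Send `segs` segments of `mss` bytes each. */
static void toy_send(struct toy_conn *c, unsigned int segs, unsigned int mss)
{
	c->packets_out += segs;
	c->bytes_sent += (uint64_t)segs * mss;
}

/* The peer acknowledges everything that was outstanding. */
static void toy_ack_all(struct toy_conn *c)
{
	c->packets_out = 0;
}

/* Check in the style of the patch: allow when nothing is in flight. */
bool toy_allow_if_nothing_in_flight(const struct toy_conn *c)
{
	return c->packets_out == 0;
}

/* Stricter check: allow only when nothing was ever sent. */
bool toy_allow_if_never_sent(const struct toy_conn *c)
{
	return c->bytes_sent == 0;
}
```

After toy_send() followed by toy_ack_all(), the first check allows the
option change again even though data has already been exchanged; the
second check still refuses it.
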

>                         err = tcp_repair_options_est(sk, optval, optlen);
>                 else
>                         err = -EPERM;
> --
> 2.31.1
>