Re: [PATCH] tcp: Add an extra check for consecutive failed keepalive probes

From: Neal Cardwell
Date: Thu Jan 09 2025 - 11:32:28 EST


On Thu, Jan 9, 2025 at 11:21 AM Lizhe <sensor1010@xxxxxxx> wrote:
>
> Add an additional check to handle situations where consecutive
> keepalive probe packets are sent without receiving a response.
>
> Signed-off-by: Lizhe <sensor1010@xxxxxxx>
> ---
> net/ipv4/tcp_timer.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index b412ed88ccd9..5a5dee8cd6d3 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -828,6 +828,12 @@ static void tcp_keepalive_timer (struct timer_list *t)
> }
> if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
> icsk->icsk_probes_out++;
> + if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
> + tcp_send_active_reset(sk, GFP_ATOMIC,
> + SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
> + tcp_write_err(sk);
> + goto out;
> + }
> elapsed = keepalive_intvl_when(tp);
> } else {
> /* If keepalive was lost due to local congestion,
> --

Can you please explain the exact motivation for your patch, ideally
providing either a tcpdump trace or packetdrill test to document the
scenario you are concerned about?

The Linux TCP keepalive logic in tcp_keepalive_timer() already
includes logic (a few lines above the spot you propose to patch) that
ensures that a connection will be closed with ETIMEDOUT if consecutive
keepalive probes fail:

if ((user_timeout != 0 &&
elapsed >= msecs_to_jiffies(user_timeout) &&
icsk->icsk_probes_out > 0) ||
(user_timeout == 0 &&
icsk->icsk_probes_out >= keepalive_probes(tp))) {
tcp_send_active_reset(sk, GFP_ATOMIC,

SK_RST_REASON_TCP_KEEPALIVE_TIMEOUT);
tcp_write_err(sk);
goto out;
}
if (tcp_write_wakeup(sk, LINUX_MIB_TCPKEEPALIVE) <= 0) {
icsk->icsk_probes_out++;
elapsed = keepalive_intvl_when(tp);

AFAICT your patch nearly duplicates the existing logic, but changes
the application-visible behavior to close the connection after one
fewer timer expiration, thus breaking the semantics of the
net.ipv4.tcp_keepalive_probes.

neal

---

ps: For reference, here is a packetdrill test we use to test this
functionality; this passes on recent Linux kernels:

// Test TCP keepalive behavior without TCP timestamps enabled.

`../common/defaults.sh`

// Create a socket.
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

// Establish a connection.
+0 < S 0:0(0) win 20000 <mss 1000,nop,nop,sackOK,nop,wscale 8>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
+.1 < . 1:1(0) ack 1 win 20000
+0 accept(3, ..., ...) = 4

// Verify keepalives are disabled by default.
+0 getsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [0], [4]) = 0
// Enable keepalives:
+0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0

// Verify default TCP_KEEPIDLE is 7200, from net.ipv4.tcp_keepalive_time=7200:
+0 getsockopt(4, SOL_TCP, TCP_KEEPIDLE, [7200], [4]) = 0
// Start sending keepalive probes after 3 seconds of idle
+0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [3], 4) = 0

// Verify default TCP_KEEPINTVL is 75, from net.ipv4.tcp_keepalive_intvl=75:
+0 getsockopt(4, SOL_TCP, TCP_KEEPINTVL, [75], [4]) = 0
// Send keepalive probes every 2 seconds.
+0 setsockopt(4, SOL_TCP, TCP_KEEPINTVL, [2], 4) = 0

// Verify default TCP_KEEPCNT is 9, from net.ipv4.tcp_keepalive_probes=9:
+0 getsockopt(4, SOL_TCP, TCP_KEEPCNT, [9], [4]) = 0
// Send 4 keepalive probes before giving up.
+0 setsockopt(4, SOL_TCP, TCP_KEEPCNT, [4], 4) = 0

// Set up an epoll operation to verify that connections terminated by failed
// keepalives will wake up blocked epoll waiters with EPOLLERR|EPOLLHUP:
+0 epoll_create(1) = 5
+0 epoll_ctl(5, EPOLL_CTL_ADD, 4, {events=EPOLLERR, fd=4}) = 0
+0...11 epoll_wait(5, {events=EPOLLERR|EPOLLHUP, fd=4}, 1, 15000) = 1

// Verify keepalive behavior looks correct, given the parameters above:

// Start sending keepalive probes after 3 seconds of idle.
+3 > . 0:0(0) ack 1
// Send keepalive probes every 2 seconds.
+2 > . 0:0(0) ack 1
+2 > . 0:0(0) ack 1
+2 > . 0:0(0) ack 1
+2 > R. 1:1(0) ack 1
// Sent 4 keepalive probes and then gave up and reset the connection.

// Verify that we get the expected error when we try to use the socket:
+0 read(4, ..., 1000) = -1 ETIMEDOUT (Connection timed out)