Re: [PATCH net 1/2] tcp: call sk_data_ready() after listener migration
From: Eric Dumazet
Date: Sat Apr 18 2026 - 02:02:46 EST
On Fri, Apr 17, 2026 at 9:17 PM Zhenzhong Wu <jt26wzz@xxxxxxxxx> wrote:
>
> When inet_csk_listen_stop() migrates an established child socket from
> a closing listener to another socket in the same SO_REUSEPORT group,
> the target listener gets a new accept-queue entry via
> inet_csk_reqsk_queue_add(), but that path never notifies the target
> listener's waiters.
>
> As a result, a nonblocking accept() still succeeds because it checks
> the accept queue directly, but waiters that sleep for listener
> readiness can remain asleep until another connection generates a
> wakeup. This affects poll()/epoll_wait()-based waiters, and can also
> leave a blocking accept() asleep after migration even though the
> child is already in the target listener's accept queue.
>
> This was observed in a local test where listener A completed the
> handshake, queued the child, and was closed before userspace called
> accept(). The child was migrated to listener B, but listener B never
> received a wakeup for the migrated accept-queue entry.
>
> Call READ_ONCE(nsk->sk_data_ready)(nsk) after a successful migration
> in inet_csk_listen_stop().
>
> The reqsk_timer_handler() path does not need the same change:
> half-open requests only become readable to userspace when the final
> ACK completes the handshake, and tcp_child_process() already wakes
> the listener in that case.
>
> Fixes: 54b92e841937 ("tcp: Migrate TCP_ESTABLISHED/TCP_SYN_RECV sockets in accept queues.")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Zhenzhong Wu <jt26wzz@xxxxxxxxx>
> ---
> net/ipv4/inet_connection_sock.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 4ac3ae1bc..da1ce082f 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -1483,6 +1483,7 @@ void inet_csk_listen_stop(struct sock *sk)
> __NET_INC_STATS(sock_net(nsk),
> LINUX_MIB_TCPMIGRATEREQSUCCESS);
> reqsk_migrate_reset(req);
> + READ_ONCE(nsk->sk_data_ready)(nsk);
I think this is adding a potential UAF (use-after-free).
@nsk might have been freed already by another thread/cpu.
Note the existing code already has similar issues.
Untested patch:
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 4ac3ae1bc1afc3a39f2790e39b4dda877dc3272b..287b6e01c4f71bfec3dd2a708f316224d9eb4a64 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -1479,6 +1479,7 @@ void inet_csk_listen_stop(struct sock *sk)
if (nreq) {
refcount_set(&nreq->rsk_refcnt, 1);
+ rcu_read_lock();
if (inet_csk_reqsk_queue_add(nsk,
nreq, child)) {
__NET_INC_STATS(sock_net(nsk),
LINUX_MIB_TCPMIGRATEREQSUCCESS);
@@ -1489,7 +1490,7 @@ void inet_csk_listen_stop(struct sock *sk)
reqsk_migrate_reset(nreq);
__reqsk_free(nreq);
}
-
+ rcu_read_unlock();
/* inet_csk_reqsk_queue_add() has already
* called inet_child_forget() on failure case.
*/