Re: [syzbot] [rds?] possible deadlock in rds_tcp_tune (2)

From: syzbot

Date: Thu Feb 26 2026 - 16:22:18 EST


> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git main

This crash does not have a reproducer. I cannot test it.

>
> commit 4cd6716706210de3ed52d549ee784a12cc8ffe3a (HEAD)
> Author: Allison Henderson <achender@xxxxxxxxxx>
> Date: Thu Feb 26 12:45:39 2026 -0700
>
> net/rds: Fix circular locking dependency in rds_tcp_tune
>
> syzbot reported a circular locking dependency in rds_tcp_tune() where
> sk_net_refcnt_upgrade() is called while holding the socket lock:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> ------------------------------------------------------
> kworker/u10:8/15040 is trying to acquire lock:
> ffffffff8e9aaf80 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc_cache_noprof+0x4b/0x6f0
>
> but task is already holding lock:
> ffff88805a3c1ce0 (k-sk_lock-AF_INET6){+.+.}-{0:0}, at: rds_tcp_tune+0xd7/0x930
>
> The issue occurs because sk_net_refcnt_upgrade() performs memory allocation
> (via get_net_track() -> ref_tracker_alloc()) while the socket lock is held,
> creating a circular dependency with fs_reclaim.
>
> Fix this by moving sk_net_refcnt_upgrade() outside the socket lock critical
> section. Since the fresh socket is not yet exposed to other threads, no
> locks are needed at this time.
>
> Reported-by: syzbot+2e2cf5331207053b8106@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=2e2cf5331207053b8106
> Fixes: 5c70eb5c593d ("net: better track kernel sockets lifetime")
> Signed-off-by: Allison Henderson <achender@xxxxxxxxxx>
>
> diff --git a/net/rds/tcp.c b/net/rds/tcp.c
> index 04f310255692..da22b3dfdbf0 100644
> --- a/net/rds/tcp.c
> +++ b/net/rds/tcp.c
> @@ -490,18 +490,24 @@ bool rds_tcp_tune(struct socket *sock)
> struct rds_tcp_net *rtn;
>
> tcp_sock_set_nodelay(sock->sk);
> - lock_sock(sk);
> /* TCP timer functions might access net namespace even after
> * a process which created this net namespace terminated.
> */
> if (!sk->sk_net_refcnt) {
> - if (!maybe_get_net(net)) {
> - release_sock(sk);
> + if (!maybe_get_net(net))
> return false;
> - }
> + /*
> + * We call sk_net_refcnt_upgrade before the lock_sock since it is
> + * not yet shared, no lock is needed at this time. Further,
> + * because sk_net_refcnt_upgrade does a GFP_KERNEL allocation,
> + * this can trigger an fs_reclaim in other systems which creates
> + * a circular lock dependency. Avoid this by upgrading the
> + * refcnt before locking the socket.
> + */
> sk_net_refcnt_upgrade(sk);
> put_net(net);
> }
> + lock_sock(sk);
> rtn = net_generic(net, rds_tcp_netid);
> if (rtn->sndbuf_size > 0) {
> sk->sk_sndbuf = rtn->sndbuf_size;
>
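
For readers unfamiliar with the pattern the quoted patch applies, here is a
minimal userspace sketch of "allocate first, lock second". It is only an
analogy under stated assumptions: struct fake_sock, fake_tune() and the
64-byte allocation are hypothetical names and values invented for
illustration, a pthread mutex stands in for the socket lock, and malloc()
stands in for the GFP_KERNEL allocation. It is not the kernel code and does
not model lockdep or fs_reclaim.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket; none of these names are kernel APIs. */
struct fake_sock {
	pthread_mutex_t lock;	/* plays the role of the socket lock */
	void *tracker;		/* plays the role of the tracking allocation */
	int sndbuf;		/* a setting that is tuned under the lock */
};

/*
 * Allocate first, lock second. The allocation may block, so it is done
 * before the lock is taken. Writing sk->tracker without the lock is only
 * acceptable under the assumption that sk is not yet visible to any
 * other thread.
 */
static bool fake_tune(struct fake_sock *sk, int sndbuf_size)
{
	assert(sk != NULL);

	if (!sk->tracker) {
		void *t = malloc(64);	/* unlocked: nothing is held here */

		if (!t)
			return false;	/* no lock to release on this path */
		sk->tracker = t;
	}

	pthread_mutex_lock(&sk->lock);
	if (sndbuf_size > 0)
		sk->sndbuf = sndbuf_size;
	pthread_mutex_unlock(&sk->lock);

	return true;
}
```

A side effect visible in the sketch, as in the quoted diff, is that the early
failure path no longer needs to drop the lock before returning.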