Re: [PATCH net v2] net/smc: fix NULL dereference and UAF in smc_tcp_syn_recv_sock()

From: Eric Dumazet

Date: Mon Mar 09 2026 - 03:36:28 EST


On Mon, Mar 9, 2026 at 8:26 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Mon, Mar 9, 2026 at 3:38 AM Jiayuan Chen <jiayuan.chen@xxxxxxxxx> wrote:
> >
> > Syzkaller reported a panic in smc_tcp_syn_recv_sock() [1].
> >
> > smc_tcp_syn_recv_sock() is called in the TCP receive path
> > (softirq) via icsk_af_ops->syn_recv_sock on the clcsock (TCP
> > listening socket). It reads sk_user_data to get the smc_sock
> > pointer. However, when the SMC listen socket is being closed
> > concurrently, smc_close_active() sets clcsock->sk_user_data
> > to NULL under sk_callback_lock, and then the smc_sock itself
> > can be freed via sock_put() in smc_release().
> >
> > This leads to two issues:
> >
> > 1) NULL pointer dereference: sk_user_data is NULL when
> > accessed.
> > 2) Use-after-free: sk_user_data is read as non-NULL, but the
> > smc_sock is freed before its fields (e.g., queued_smc_hs,
> > ori_af_ops) are accessed.
> >
> > The race window looks like this:
> >
> > CPU A (softirq) CPU B (process ctx)
> >
> > tcp_v4_rcv()
> > TCP_NEW_SYN_RECV:
> > sk = req->rsk_listener
> > sock_hold(sk)
> > /* No lock on listener */
> > smc_close_active():
> > write_lock_bh(cb_lock)
> > sk_user_data = NULL
> > write_unlock_bh(cb_lock)
> > ...
> > smc_clcsock_release()
> > sock_put(smc->sk) x2
> > -> smc_sock freed!
> > tcp_check_req()
> > smc_tcp_syn_recv_sock():
> > smc = user_data(sk)
> > -> NULL or dangling
> > smc->queued_smc_hs
> > -> crash!
> >
> > Note that the clcsock and smc_sock are two independent objects
> > with separate refcounts. TCP stack holds a reference on the
> > clcsock, which keeps it alive, but this does NOT prevent the
> > smc_sock from being freed.
> >
> > Fix this by using RCU and refcount_inc_not_zero() to safely
> > access smc_sock. Since smc_tcp_syn_recv_sock() is called in
> > the TCP three-way handshake path, taking read_lock_bh on
> > sk_callback_lock is too heavy and would not survive a SYN
> > flood attack. Using rcu_read_lock() is much more lightweight.
> >
> > - Set SOCK_RCU_FREE on the SMC listen socket so that
> > smc_sock freeing is deferred until after the RCU grace
> > period. This guarantees the memory is still valid when
> > accessed inside rcu_read_lock().
> > - Use rcu_read_lock() to protect reading sk_user_data.
> > - Use refcount_inc_not_zero(&smc->sk.sk_refcnt) to pin the
> > smc_sock. If the refcount has already reached zero (close
> > path completed), it returns false and we bail out safely.
> >
> > Note: smc_hs_congested() has a similar lockless read of
> > sk_user_data without rcu_read_lock(), but it only checks for
> > NULL and accesses the global smc_hs_wq, never dereferencing
> > any smc_sock field, so it is not affected.
> >
> > Reproducer was verified with mdelay injection and smc_run,
> > the issue no longer occurs with this patch applied.
> >
> > [1] https://syzkaller.appspot.com/bug?extid=827ae2bfb3a3529333e9
> >
> > Fixes: 8270d9c21041 ("net/smc: Limit backlog connections")
> > Reported-by: syzbot+827ae2bfb3a3529333e9@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Closes: https://lore.kernel.org/all/67eaf9b8.050a0220.3c3d88.004a.GAE@xxxxxxxxxx/T/
> > Suggested-by: Eric Dumazet <edumazet@xxxxxxxxxx>
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@xxxxxxxxx>
> > ---
> > v2:
> > - Use rcu_read_lock() + refcount_inc_not_zero() instead of
> > read_lock_bh(sk_callback_lock) + sock_hold(), since this
> > is the TCP handshake hot path and read_lock_bh is too
> > expensive under SYN flood.
> > - Set SOCK_RCU_FREE on SMC listen socket to ensure
> > RCU-deferred freeing.
>
> This looks better, but please note there is something missing in your
> RCU implementation.
>
> You still need to ensure s->sk_user_data is read / written with
> load/store tearing prevention.
> Standard rcu_dereference()/rcu_assign() are dealing with this aspect.
>
> For instance, the following helper is assuming a lock was held :
>
> static inline struct smc_sock *smc_clcsock_user_data(const struct sock *clcsk)
> {
> return (struct smc_sock *)
> ((uintptr_t)clcsk->sk_user_data & ~SK_USER_DATA_NOCOPY);
> }
>
>
> One possible way is to use
> rcu_dereference_sk_user_data()/rcu_assign_sk_user_data() or similar
> helpers.
>

More context in this old commit

commit 559835ea7292e2f09304d81eda16f4209433245e
Author: Pravin B Shelar <pshelar@xxxxxxxxxx>
Date: Tue Sep 24 10:25:40 2013 -0700

vxlan: Use RCU apis to access sk_user_data.

Use of RCU api makes vxlan code easier to understand. It also
fixes bug due to missing ACCESS_ONCE() on sk_user_data dereference.
In rare case without ACCESS_ONCE() compiler might omit vs on
sk_user_data dereference.
Compiler can use vs as alias for sk->sk_user_data, resulting in
multiple sk_user_data dereference in rcu read context which
could change.

CC: Jesse Gross <jesse@xxxxxxxxxx>
Signed-off-by: Pravin B Shelar <pshelar@xxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>