Re: [syzbot] Re: [syzbot] [bpf?] [net?] WARNING in sock_map_destroy

From: syzbot
Date: Sun Sep 29 2024 - 03:43:07 EST


For archival purposes, forwarding an incoming command email to
linux-kernel@xxxxxxxxxxxxxxx, syzkaller-bugs@xxxxxxxxxxxxxxxx.

***

Subject: Re: [syzbot] [bpf?] [net?] WARNING in sock_map_destroy
Author: dmantipov@xxxxxxxxx

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git d505d3593b52b6c43507f119572409087416ba28

From 12b932f214f0caaaa79f929ab81b46ebea125a06 Mon Sep 17 00:00:00 2001
From: Dmitry Antipov <dmantipov@xxxxxxxxx>
Date: Tue, 10 Sep 2024 14:21:20 +0300
Subject: [PATCH net v2] net: sockmap: avoid race between sock_map_destroy() and
sk_psock_put()

Syzbot has triggered the following race condition:

On CPU0, 'sk_psock_drop()' (most likely scheduled from 'sock_map_unref()'
called by 'sock_map_update_common()') is running at [1]:

void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
{
	write_lock_bh(&sk->sk_callback_lock);
	sk_psock_restore_proto(sk, psock);		[1]
	rcu_assign_sk_user_data(sk, NULL);		[2]
	...
}

If 'sock_map_destroy()' is scheduled on CPU1 at the same time, psock is
always NULL at [3]. But since [1] may be in progress during [4], the
value of 'saved_destroy' at this point is undefined:

void sock_map_destroy(struct sock *sk)
{
	void (*saved_destroy)(struct sock *sk);
	struct sk_psock *psock;

	rcu_read_lock();
	psock = sk_psock_get(sk);					[3]
	if (unlikely(!psock)) {
		rcu_read_unlock();
		saved_destroy = READ_ONCE(sk->sk_prot)->destroy;	[4]
	} else {
		saved_destroy = psock->saved_destroy;			[5]
		sock_map_remove_links(sk, psock);
		rcu_read_unlock();
		sk_psock_stop(psock);
		sk_psock_put(sk, psock);
	}
	if (WARN_ON_ONCE(saved_destroy == sock_map_destroy))
		return;
	if (saved_destroy)
		saved_destroy(sk);
}

Fix this issue in 3 steps:

1. Prefer 'sk_psock()' over 'sk_psock_get()' at [3]. Since zero
   refcount is ignored, 'psock' is non-NULL until [2] is completed.

2. Add a read lock around [5], to make sure that [1] is not in progress
   when the former is executed.

3. Since 'sk_psock()' does not adjust reference counting, drop
   'sk_psock_put()' and the redundant 'sk_psock_stop()' (which is
   executed by 'sk_psock_drop()' anyway).
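
The reasoning behind step 1 can be illustrated with a small sequential
userspace model. This is only a sketch, not kernel code: every name in it
(struct psock_model, freeze_drop_at(), lookup_with_get() and so on) is
hypothetical, the refcount is a plain int, and 'sk_callback_lock' from
step 2 is not modelled at all. It freezes 'sk_psock_drop()' before [1],
between [1] and [2], and after [2], and asks which destroy callback each
lookup style would pick at that moment.

```c
#include <assert.h>
#include <stddef.h>

struct sock;
typedef void (*destroy_fn)(struct sock *sk);

/* Hypothetical stand-ins for struct sk_psock, struct proto, struct sock. */
struct psock_model { int refcnt; destroy_fn saved_destroy; };
struct proto_model { destroy_fn destroy; };
struct sock {
	const struct proto_model *prot;	/* models sk->sk_prot */
	struct psock_model *user_data;	/* models sk_user_data */
	int last_destroy;		/* only keeps the callbacks distinct */
};

static void base_destroy(struct sock *sk) { sk->last_destroy = 1; }
static void map_destroy(struct sock *sk) { sk->last_destroy = 2; }

static const struct proto_model base_proto = { base_destroy };
static const struct proto_model map_proto = { map_destroy };

/* Points at which the model freezes the drop path. */
enum drop_stage { BEFORE_1, BETWEEN_1_AND_2, AFTER_2 };

/* Put sk/ps into the state the drop path has reached at 'stage'. */
static void freeze_drop_at(struct sock *sk, struct psock_model *ps,
			   enum drop_stage stage)
{
	assert(sk && ps);
	ps->refcnt = 0;			/* drop runs after the last put */
	ps->saved_destroy = base_destroy;
	/* [1] restores the original proto, [2] clears the psock pointer. */
	sk->prot = (stage == BEFORE_1) ? &map_proto : &base_proto;
	sk->user_data = (stage == AFTER_2) ? NULL : ps;
	sk->last_destroy = 0;
}

/* Models the old lookup: a zero refcount is treated like a NULL psock. */
static destroy_fn lookup_with_get(const struct sock *sk)
{
	const struct psock_model *ps = sk->user_data;

	if (!ps || ps->refcnt == 0)
		return sk->prot->destroy;
	return ps->saved_destroy;
}

/* Models the new lookup: the refcount is ignored. */
static destroy_fn lookup_without_get(const struct sock *sk)
{
	const struct psock_model *ps = sk->user_data;

	if (!ps)
		return sk->prot->destroy;
	return ps->saved_destroy;
}

/* Count the stages at which 'lookup' would pick map_destroy itself. */
static int count_self_calls(destroy_fn (*lookup)(const struct sock *))
{
	struct sock sk;
	struct psock_model ps;
	int hits = 0;

	for (int stage = BEFORE_1; stage <= AFTER_2; stage++) {
		freeze_drop_at(&sk, &ps, (enum drop_stage)stage);
		if (lookup(&sk) == map_destroy)
			hits++;
	}
	return hits;
}
```

In this model, count_self_calls(lookup_with_get) finds one stage (before
[1]) where the callback that triggers the WARN_ON_ONCE() would be picked,
while count_self_calls(lookup_without_get) finds none. The model checks
each stage in isolation; in the real code it is the read lock from step 2
that keeps [5] from observing [1] halfway through.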

Fixes: 5b4a79ba65a1 ("bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself")
Reported-by: syzbot+f363afac6b0ace576f45@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://syzkaller.appspot.com/bug?extid=f363afac6b0ace576f45
Signed-off-by: Dmitry Antipov <dmantipov@xxxxxxxxx>
---
net/core/sock_map.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index d3dbb92153f2..1eeb1d3a6b71 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1649,16 +1649,16 @@ void sock_map_destroy(struct sock *sk)
 	struct sk_psock *psock;

 	rcu_read_lock();
-	psock = sk_psock_get(sk);
+	psock = sk_psock(sk);
 	if (unlikely(!psock)) {
 		rcu_read_unlock();
 		saved_destroy = READ_ONCE(sk->sk_prot)->destroy;
 	} else {
+		read_lock_bh(&sk->sk_callback_lock);
 		saved_destroy = psock->saved_destroy;
+		read_unlock_bh(&sk->sk_callback_lock);
 		sock_map_remove_links(sk, psock);
 		rcu_read_unlock();
-		sk_psock_stop(psock);
-		sk_psock_put(sk, psock);
 	}
 	if (WARN_ON_ONCE(saved_destroy == sock_map_destroy))
 		return;
--
2.46.0