Re: [syzbot] [net?] possible deadlock in rtnl_lock (8)

From: syzbot
Date: Wed Sep 11 2024 - 08:07:16 EST


Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in rtnl_lock

======================================================
WARNING: possible circular locking dependency detected
6.11.0-rc7-syzkaller-g7e3e2c7f05cd-dirty #0 Not tainted
------------------------------------------------------
syz.0.15/7317 is trying to acquire lock:
ffff8000923b7ea8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x20/0x2c net/core/rtnetlink.c:79

but task is already holding lock:
ffff0000d4798a58 (&smc->clcsock_release_lock){+.+.}-{3:3}, at: smc_setsockopt+0x178/0x10fc net/smc/af_smc.c:3064

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&smc->clcsock_release_lock){+.+.}-{3:3}:
__mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
__mutex_lock kernel/locking/mutex.c:752 [inline]
mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
smc_switch_to_fallback+0x48/0xa80 net/smc/af_smc.c:902
smc_sendmsg+0xfc/0x9f8 net/smc/af_smc.c:2779
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
__sys_sendto+0x374/0x4f4 net/socket.c:2204
__do_sys_sendto net/socket.c:2216 [inline]
__se_sys_sendto net/socket.c:2212 [inline]
__arm64_sys_sendto+0xd8/0xf8 net/socket.c:2212
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

-> #1 (sk_lock-AF_INET){+.+.}-{0:0}:
lock_sock_nested net/core/sock.c:3543 [inline]
lock_sock include/net/sock.h:1607 [inline]
sockopt_lock_sock+0x88/0x148 net/core/sock.c:1061
do_ip_setsockopt+0x1438/0x346c net/ipv4/ip_sockglue.c:1078
ip_setsockopt+0x80/0x128 net/ipv4/ip_sockglue.c:1417
raw_setsockopt+0x100/0x294 net/ipv4/raw.c:845
sock_common_setsockopt+0xb0/0xcc net/core/sock.c:3735
do_sock_setsockopt+0x2a0/0x4e0 net/socket.c:2324
__sys_setsockopt+0x128/0x1a8 net/socket.c:2347
__do_sys_setsockopt net/socket.c:2356 [inline]
__se_sys_setsockopt net/socket.c:2353 [inline]
__arm64_sys_setsockopt+0xb8/0xd4 net/socket.c:2353
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

-> #0 (rtnl_mutex){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3133 [inline]
check_prevs_add kernel/locking/lockdep.c:3252 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
__mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
__mutex_lock kernel/locking/mutex.c:752 [inline]
mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
rtnl_lock+0x20/0x2c net/core/rtnetlink.c:79
do_ip_setsockopt+0xe8c/0x346c net/ipv4/ip_sockglue.c:1077
ip_setsockopt+0x80/0x128 net/ipv4/ip_sockglue.c:1417
tcp_setsockopt+0xcc/0xe8 net/ipv4/tcp.c:3768
sock_common_setsockopt+0xb0/0xcc net/core/sock.c:3735
smc_setsockopt+0x204/0x10fc net/smc/af_smc.c:3072
do_sock_setsockopt+0x2a0/0x4e0 net/socket.c:2324
__sys_setsockopt+0x128/0x1a8 net/socket.c:2347
__do_sys_setsockopt net/socket.c:2356 [inline]
__se_sys_setsockopt net/socket.c:2353 [inline]
__arm64_sys_setsockopt+0xb8/0xd4 net/socket.c:2353
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

other info that might help us debug this:

Chain exists of:
rtnl_mutex --> sk_lock-AF_INET --> &smc->clcsock_release_lock

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&smc->clcsock_release_lock);
                               lock(sk_lock-AF_INET);
                               lock(&smc->clcsock_release_lock);
  lock(rtnl_mutex);

*** DEADLOCK ***
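
The dependency chain reported above can be read as a small directed graph of "held first, acquired second" edges. The following is only an illustrative userspace sketch, not lockdep's actual implementation: the edge list is transcribed by hand from chain entries #1, #2 and #0, and the helper names (find_path, check_new_edge) are hypothetical. It shows why taking rtnl_mutex while holding &smc->clcsock_release_lock closes a cycle.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (lock held first, lock acquired second)

# Ordering edges already recorded, transcribed from the report (assumption:
# one edge per chain entry, read off the call traces by hand).
EXISTING_EDGES: List[Edge] = [
    ("rtnl_mutex", "sk_lock-AF_INET"),                  # entry #1
    ("sk_lock-AF_INET", "&smc->clcsock_release_lock"),  # entry #2
]

# The acquisition that triggered the warning (entry #0): rtnl_lock() is
# called while smc_setsockopt() still holds clcsock_release_lock.
NEW_EDGE: Edge = ("&smc->clcsock_release_lock", "rtnl_mutex")


def find_path(graph: Dict[str, List[str]], src: str, dst: str) -> Optional[List[str]]:
    """Depth-first search; return one path from src to dst, or None."""
    stack = [(src, [src])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def check_new_edge(existing: List[Edge], new_edge: Edge) -> Optional[List[str]]:
    """Return the cycle that new_edge would close, or None if it is safe."""
    graph: Dict[str, List[str]] = {}
    for held, acquired in existing:
        graph.setdefault(held, []).append(acquired)
    held, acquired = new_edge
    # held -> acquired closes a cycle iff acquired can already reach held.
    path = find_path(graph, acquired, held)
    return None if path is None else path + [acquired]


cycle = check_new_edge(EXISTING_EDGES, NEW_EDGE)
print(" --> ".join(cycle) if cycle else "no cycle")
```

Under these assumptions the sketch reports the same chain as the splat, with rtnl_mutex appearing at both ends. Any fix has to remove one of the three edges, for example by not calling into the clcsock setsockopt path with clcsock_release_lock held.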

1 lock held by syz.0.15/7317:
#0: ffff0000d4798a58 (&smc->clcsock_release_lock){+.+.}-{3:3}, at: smc_setsockopt+0x178/0x10fc net/smc/af_smc.c:3064

stack backtrace:
CPU: 1 UID: 0 PID: 7317 Comm: syz.0.15 Not tainted 6.11.0-rc7-syzkaller-g7e3e2c7f05cd-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call trace:
dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:319
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:326
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:119
dump_stack+0x1c/0x28 lib/dump_stack.c:128
print_circular_bug+0x150/0x1b8 kernel/locking/lockdep.c:2059
check_noncircular+0x310/0x404 kernel/locking/lockdep.c:2186
check_prev_add kernel/locking/lockdep.c:3133 [inline]
check_prevs_add kernel/locking/lockdep.c:3252 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
__mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
__mutex_lock kernel/locking/mutex.c:752 [inline]
mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
rtnl_lock+0x20/0x2c net/core/rtnetlink.c:79
do_ip_setsockopt+0xe8c/0x346c net/ipv4/ip_sockglue.c:1077
ip_setsockopt+0x80/0x128 net/ipv4/ip_sockglue.c:1417
tcp_setsockopt+0xcc/0xe8 net/ipv4/tcp.c:3768
sock_common_setsockopt+0xb0/0xcc net/core/sock.c:3735
smc_setsockopt+0x204/0x10fc net/smc/af_smc.c:3072
do_sock_setsockopt+0x2a0/0x4e0 net/socket.c:2324
__sys_setsockopt+0x128/0x1a8 net/socket.c:2347
__do_sys_setsockopt net/socket.c:2356 [inline]
__se_sys_setsockopt net/socket.c:2353 [inline]
__arm64_sys_setsockopt+0xb8/0xd4 net/socket.c:2353
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598


Tested on:

commit: 7e3e2c7f Merge branch 'for-next/core' into for-kernelci
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=13b56100580000
kernel config: https://syzkaller.appspot.com/x/.config?x=921accd5d8340211
dashboard link: https://syzkaller.appspot.com/bug?extid=51cf7cc5f9ffc1006ef2
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
patch: https://syzkaller.appspot.com/x/patch.diff?x=16856100580000