Re: [syzbot] [net?] possible deadlock in team_device_event (3)

From: syzbot
Date: Tue Jul 30 2024 - 07:48:06 EST


Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in team_device_event

netlink: 'syz.0.15': attribute type 10 has an invalid length.
dummy0: left promiscuous mode
dummy0: entered promiscuous mode
============================================
WARNING: possible recursive locking detected
6.11.0-rc1-syzkaller-g94ede2a3e913 #0 Not tainted
--------------------------------------------
syz.0.15/5896 is trying to acquire lock:
ffff88801e9d4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_port_change_check drivers/net/team/team_core.c:2950 [inline]
ffff88801e9d4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_device_event+0x2c7/0x770 drivers/net/team/team_core.c:2973

but task is already holding lock:
ffff88801e9d4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_add_slave+0x9c/0x20e0 drivers/net/team/team_core.c:1975

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(team->team_lock_key#2);
lock(team->team_lock_key#2);

*** DEADLOCK ***

May be due to missing lock nesting notation

2 locks held by syz.0.15/5896:
#0: ffffffff8fa1e9a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
#0: ffffffff8fa1e9a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x372/0xea0 net/core/rtnetlink.c:6644
#1: ffff88801e9d4d40 (team->team_lock_key#2){+.+.}-{3:3}, at: team_add_slave+0x9c/0x20e0 drivers/net/team/team_core.c:1975

stack backtrace:
CPU: 1 UID: 0 PID: 5896 Comm: syz.0.15 Not tainted 6.11.0-rc1-syzkaller-g94ede2a3e913 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:119
check_deadlock kernel/locking/lockdep.c:3061 [inline]
validate_chain kernel/locking/lockdep.c:3855 [inline]
__lock_acquire+0x2167/0x3cb0 kernel/locking/lockdep.c:5142
lock_acquire kernel/locking/lockdep.c:5759 [inline]
lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5724
__mutex_lock_common kernel/locking/mutex.c:608 [inline]
__mutex_lock+0x175/0x9c0 kernel/locking/mutex.c:752
team_port_change_check drivers/net/team/team_core.c:2950 [inline]
team_device_event+0x2c7/0x770 drivers/net/team/team_core.c:2973
notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1994
call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
call_netdevice_notifiers net/core/dev.c:2046 [inline]
__dev_notify_flags+0x12d/0x2e0 net/core/dev.c:8876
dev_change_flags+0x10c/0x160 net/core/dev.c:8914
vlan_device_event+0xdfc/0x2120 net/8021q/vlan.c:468
notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1994
call_netdevice_notifiers_extack net/core/dev.c:2032 [inline]
call_netdevice_notifiers net/core/dev.c:2046 [inline]
dev_open net/core/dev.c:1515 [inline]
dev_open+0x144/0x160 net/core/dev.c:1503
team_port_add drivers/net/team/team_core.c:1216 [inline]
team_add_slave+0xacd/0x20e0 drivers/net/team/team_core.c:1976
do_set_master+0x1bc/0x230 net/core/rtnetlink.c:2701
do_setlink+0xcaf/0x3ff0 net/core/rtnetlink.c:2907
__rtnl_newlink+0xc35/0x1960 net/core/rtnetlink.c:3696
rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3743
rtnetlink_rcv_msg+0x3c7/0xea0 net/core/rtnetlink.c:6647
netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2550
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x544/0x830 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0xab5/0xc90 net/socket.c:2597
___sys_sendmsg+0x135/0x1e0 net/socket.c:2651
__sys_sendmsg+0x117/0x1f0 net/socket.c:2680
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f16bd377299
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f16be089048 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f16bd506058 RCX: 00007f16bd377299
RDX: 0000000000000000 RSI: 0000000020000600 RDI: 0000000000000012
RBP: 00007f16bd3e48e6 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f16bd506058 R15: 00007ffebae71978
</TASK>
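
For readers following the trace: the lock being acquired and the lock already held have the same address (ffff88801e9d4d40), so this is the same lock instance being taken twice by one task. team_add_slave() takes it, and the notifier chain started by dev_open() re-enters team_device_event(), which tries to take it again. Below is a minimal userspace sketch of that shape, not kernel code and not a proposed fix. It assumes POSIX threads; the names fake_team, fake_team_init, fake_device_event and fake_add_slave are hypothetical stand-ins invented for illustration. An error-checking mutex is used so the second acquisition reports EDEADLK instead of hanging.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the team device; only the lock matters here. */
struct fake_team {
	pthread_mutex_t lock;
};

/*
 * Initialise the lock as an error-checking mutex, so that a second
 * acquisition by the owning thread returns EDEADLK instead of blocking.
 */
int fake_team_init(struct fake_team *team)
{
	pthread_mutexattr_t attr;
	int err;

	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;

	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&team->lock, &attr);

	pthread_mutexattr_destroy(&attr);
	return err;
}

/*
 * Models the notifier path (team_device_event -> team_port_change_check):
 * it takes the team lock unconditionally.
 */
int fake_device_event(struct fake_team *team)
{
	int err = pthread_mutex_lock(&team->lock);

	if (err)
		return err;	/* EDEADLK if the caller already holds the lock */

	pthread_mutex_unlock(&team->lock);
	return 0;
}

/*
 * Models team_add_slave: it takes the team lock, then triggers the
 * notifier path while still holding it (in the trace this happens
 * through dev_open and the vlan notifier).
 */
int fake_add_slave(struct fake_team *team)
{
	int err = pthread_mutex_lock(&team->lock);

	if (err)
		return err;

	err = fake_device_event(team);	/* re-enters on the same lock */

	pthread_mutex_unlock(&team->lock);
	return err;
}
```

Calling fake_device_event() on its own succeeds, while fake_add_slave() returns EDEADLK from the nested acquisition. With a default (non-error-checking) mutex, the nested call would typically block forever, which is the outcome lockdep is warning about.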


Tested on:

commit: 94ede2a3 profiling: remove stale percpu flip buffer va..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10b92c6d980000
kernel config: https://syzkaller.appspot.com/x/.config?x=7c04fc17f2c61c03
dashboard link: https://syzkaller.appspot.com/bug?extid=b668da2bc4cb9670bf58
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.