Re: [syzbot] [net?] possible deadlock in unix_del_edges

From: Kuniyuki Iwashima
Date: Thu Apr 04 2024 - 13:02:43 EST


From: syzbot <syzbot+7f7f201cc2668a8fd169@xxxxxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 04 Apr 2024 09:13:26 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 2b3d5988ae2c Add linux-next specific files for 20240404
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=13114d8d180000
> kernel config: https://syzkaller.appspot.com/x/.config?x=9c48fd2523cdee5e
> dashboard link: https://syzkaller.appspot.com/bug?extid=7f7f201cc2668a8fd169
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=113c7103180000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1133aaa9180000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/136270ed2c7b/disk-2b3d5988.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/466d2f7c1952/vmlinux-2b3d5988.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/7dfaf3959891/bzImage-2b3d5988.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+7f7f201cc2668a8fd169@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> ============================================
> WARNING: possible recursive locking detected
> 6.9.0-rc2-next-20240404-syzkaller #0 Not tainted
> --------------------------------------------
> kworker/u8:3/51 is trying to acquire lock:
> ffffffff8f6dc178 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
> ffffffff8f6dc178 (unix_gc_lock){+.+.}-{2:2}, at: unix_del_edges+0x30/0x590 net/unix/garbage.c:227
>
> but task is already holding lock:
> ffffffff8f6dc178 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
> ffffffff8f6dc178 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0xc5/0x1830 net/unix/garbage.c:547
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(unix_gc_lock);
> lock(unix_gc_lock);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 4 locks held by kworker/u8:3/51:
> #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3193 [inline]
> #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3299
> #1: ffffc90000bb7d00 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3194 [inline]
> #1: ffffc90000bb7d00 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3299
> #2: ffffffff8f6dc178 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
> #2: ffffffff8f6dc178 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0xc5/0x1830 net/unix/garbage.c:547
> #3: ffff88802bd76118 (rlock-AF_UNIX){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
> #3: ffff88802bd76118 (rlock-AF_UNIX){+.+.}-{2:2}, at: unix_collect_skb+0xb8/0x700 net/unix/garbage.c:343
>
> stack backtrace:
> CPU: 0 PID: 51 Comm: kworker/u8:3 Not tainted 6.9.0-rc2-next-20240404-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
> Workqueue: events_unbound __unix_gc
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> check_deadlock kernel/locking/lockdep.c:3062 [inline]
> validate_chain+0x15c1/0x58e0 kernel/locking/lockdep.c:3856
> __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
> __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
> _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
> spin_lock include/linux/spinlock.h:351 [inline]
> unix_del_edges+0x30/0x590 net/unix/garbage.c:227
> unix_destroy_fpl+0x59/0x210 net/unix/garbage.c:286
> unix_detach_fds net/unix/af_unix.c:1816 [inline]
> unix_destruct_scm+0x13e/0x210 net/unix/af_unix.c:1873
> skb_release_head_state+0x100/0x250 net/core/skbuff.c:1162
> skb_release_all net/core/skbuff.c:1173 [inline]
> __kfree_skb net/core/skbuff.c:1189 [inline]
> kfree_skb_reason+0x16d/0x3b0 net/core/skbuff.c:1225
> kfree_skb include/linux/skbuff.h:1262 [inline]
> unix_collect_skb+0x5e4/0x700 net/unix/garbage.c:361

It seems the OOB skb has already dropped a refcount but was not cleared?

Will look into the repro today.

Thanks!


> __unix_walk_scc net/unix/garbage.c:481 [inline]
> unix_walk_scc net/unix/garbage.c:506 [inline]
> __unix_gc+0x108c/0x1830 net/unix/garbage.c:559
> process_one_work kernel/workqueue.c:3218 [inline]
> process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
> worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
> kthread+0x2f0/0x390 kernel/kthread.c:388
> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
> </TASK>
>
>
> ---
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.