Re: [syzbot] [ocfs2?] possible deadlock in ocfs2_finish_quota_recovery

From: syzbot
Date: Thu Mar 13 2025 - 09:50:55 EST


syzbot has found a reproducer for the following issue on:

HEAD commit: b7f94fcf5546 Merge tag 'sched_ext-for-6.14-rc6-fixes' of g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=113addb0580000
kernel config: https://syzkaller.appspot.com/x/.config?x=efbd4e7089941bb6
dashboard link: https://syzkaller.appspot.com/bug?extid=f59a1ae7b7227c859b8f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1695e698580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12a6e04c580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-b7f94fcf.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/961a37fe09ad/vmlinux-b7f94fcf.xz
kernel image: https://storage.googleapis.com/syzbot-assets/483b4fd1ba55/bzImage-b7f94fcf.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/c0b6b6f0715a/mount_0.gz
fsck result: OK (log: https://syzkaller.appspot.com/x/fsck.log?x=14a6e04c580000)

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f59a1ae7b7227c859b8f@xxxxxxxxxxxxxxxxxxxxxxxxx

ocfs2: Finishing quota recovery on device (7,0) for slot 0
======================================================
WARNING: possible circular locking dependency detected
6.14.0-rc6-syzkaller-00022-gb7f94fcf5546 #0 Not tainted
------------------------------------------------------
kworker/u4:10/1087 is trying to acquire lock:
ffff88803c49e0e0 (&type->s_umount_key#42){++++}-{4:4}, at: ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603

but task is already holding lock:
ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3214 [inline]
ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x9c6/0x18e0 kernel/workqueue.c:3319

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
process_one_work kernel/workqueue.c:3214 [inline]
process_scheduled_works+0x9e4/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #1 ((wq_completion)ocfs2_wq){+.+.}-{0:0}:
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
touch_wq_lockdep_map+0xc7/0x170 kernel/workqueue.c:3907
__flush_workqueue+0x14a/0x1280 kernel/workqueue.c:3949
ocfs2_shutdown_local_alloc+0x109/0xa90 fs/ocfs2/localalloc.c:380
ocfs2_dismount_volume+0x202/0x910 fs/ocfs2/super.c:1822
generic_shutdown_super+0x139/0x2d0 fs/super.c:642
kill_block_super+0x44/0x90 fs/super.c:1710
deactivate_locked_super+0xc4/0x130 fs/super.c:473
cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1413
task_work_run+0x24f/0x310 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (&type->s_umount_key#42){++++}-{4:4}:
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xabe/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Chain exists of:
&type->s_umount_key#42 --> (wq_completion)ocfs2_wq --> (work_completion)(&journal->j_recovery_work)

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock((work_completion)(&journal->j_recovery_work));
lock((wq_completion)ocfs2_wq);
lock((work_completion)(&journal->j_recovery_work));
rlock(&type->s_umount_key#42);

*** DEADLOCK ***

2 locks held by kworker/u4:10/1087:
#0: ffff8880403eb148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3213 [inline]
#0: ffff8880403eb148 ((wq_completion)ocfs2_wq){+.+.}-{0:0}, at: process_scheduled_works+0x98b/0x18e0 kernel/workqueue.c:3319
#1: ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3214 [inline]
#1: ffffc900026ffc60 ((work_completion)(&journal->j_recovery_work)){+.+.}-{0:0}, at: process_scheduled_works+0x9c6/0x18e0 kernel/workqueue.c:3319

stack backtrace:
CPU: 0 UID: 0 PID: 1087 Comm: kworker/u4:10 Not tainted 6.14.0-rc6-syzkaller-00022-gb7f94fcf5546 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Workqueue: ocfs2_wq ocfs2_complete_recovery
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2076
check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2208
check_prev_add kernel/locking/lockdep.c:3163 [inline]
check_prevs_add kernel/locking/lockdep.c:3282 [inline]
validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
ocfs2_finish_quota_recovery+0x15c/0x22a0 fs/ocfs2/quota_local.c:603
ocfs2_complete_recovery+0x17c1/0x25c0 fs/ocfs2/journal.c:1357
process_one_work kernel/workqueue.c:3238 [inline]
process_scheduled_works+0xabe/0x18e0 kernel/workqueue.c:3319
worker_thread+0x870/0xd30 kernel/workqueue.c:3400
kthread+0x7a9/0x920 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.