[syzbot] [mm?] possible deadlock in collapse_file

From: syzbot
Date: Sat Mar 04 2023 - 21:04:19 EST


Hello,

syzbot found the following issue on:

HEAD commit: 1716a175592a Add linux-next specific files for 20230301
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1566c97f480000
kernel config: https://syzkaller.appspot.com/x/.config?x=e4da7f0aef5d2eb8
dashboard link: https://syzkaller.appspot.com/bug?extid=534d1c3c0c08473dc853
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f1717f480000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=130f6874c80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0745b94b7a1b/disk-1716a175.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/9a0be79f3fd5/vmlinux-1716a175.xz
kernel image: https://storage.googleapis.com/syzbot-assets/438e9e5cf49a/bzImage-1716a175.xz

The issue was bisected to:

commit 3d7cb67369a08d4933713290acf458990a50b6f9
Author: Suren Baghdasaryan <surenb@xxxxxxxxxx>
Date: Mon Feb 27 17:36:28 2023 +0000

x86/mm: try VMA lock-based page fault handling first

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=10265502c80000
final oops: https://syzkaller.appspot.com/x/report.txt?x=12265502c80000
console output: https://syzkaller.appspot.com/x/log.txt?x=14265502c80000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+534d1c3c0c08473dc853@xxxxxxxxxxxxxxxxxxxxxxxxx
Fixes: 3d7cb67369a0 ("x86/mm: try VMA lock-based page fault handling first")

======================================================
WARNING: possible circular locking dependency detected
6.2.0-next-20230301-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor115/5084 is trying to acquire lock:
ffff888078307a90 (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write include/linux/mm.h:678 [inline]
ffff888078307a90 (&vma->vm_lock->lock){++++}-{3:3}, at: retract_page_tables mm/khugepaged.c:1826 [inline]
ffff888078307a90 (&vma->vm_lock->lock){++++}-{3:3}, at: collapse_file+0x4fa5/0x5980 mm/khugepaged.c:2204

but task is already holding lock:
ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: i_mmap_lock_write include/linux/fs.h:468 [inline]
ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: retract_page_tables mm/khugepaged.c:1745 [inline]
ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: collapse_file+0x3da6/0x5980 mm/khugepaged.c:2204

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&mapping->i_mmap_rwsem){++++}-{3:3}:
down_write+0x92/0x200 kernel/locking/rwsem.c:1573
i_mmap_lock_write include/linux/fs.h:468 [inline]
dma_resv_lockdep+0x26f/0x5f0 drivers/dma-buf/dma-resv.c:760
do_one_initcall+0x141/0x7d0 init/main.c:1306
do_initcall_level init/main.c:1379 [inline]
do_initcalls init/main.c:1395 [inline]
do_basic_setup init/main.c:1414 [inline]
kernel_init_freeable+0x5ec/0x900 init/main.c:1634
kernel_init+0x1e/0x2c0 init/main.c:1522
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

-> #1 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire mm/page_alloc.c:4647 [inline]
fs_reclaim_acquire+0x11d/0x160 mm/page_alloc.c:4661
might_alloc include/linux/sched/mm.h:299 [inline]
prepare_alloc_pages+0x159/0x570 mm/page_alloc.c:5293
__alloc_pages+0x149/0x5c0 mm/page_alloc.c:5511
__folio_alloc+0x16/0x40 mm/page_alloc.c:5554
vma_alloc_folio+0x155/0x850 mm/mempolicy.c:2244
do_anonymous_page mm/memory.c:4062 [inline]
handle_pte_fault mm/memory.c:4917 [inline]
__handle_mm_fault+0x1857/0x3e70 mm/memory.c:5061
handle_mm_fault+0x2c0/0x9c0 mm/memory.c:5207
do_user_addr_fault+0x2c1/0x1210 arch/x86/mm/fault.c:1349
handle_page_fault arch/x86/mm/fault.c:1534 [inline]
exc_page_fault+0x98/0x170 arch/x86/mm/fault.c:1590
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570

-> #0 (&vma->vm_lock->lock){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3098 [inline]
check_prevs_add kernel/locking/lockdep.c:3217 [inline]
validate_chain kernel/locking/lockdep.c:3832 [inline]
__lock_acquire+0x2ec7/0x5d40 kernel/locking/lockdep.c:5056
lock_acquire.part.0+0x11a/0x370 kernel/locking/lockdep.c:5669
down_write+0x92/0x200 kernel/locking/rwsem.c:1573
vma_start_write include/linux/mm.h:678 [inline]
retract_page_tables mm/khugepaged.c:1826 [inline]
collapse_file+0x4fa5/0x5980 mm/khugepaged.c:2204
hpage_collapse_scan_file+0xcd3/0x1680 mm/khugepaged.c:2358
madvise_collapse+0x53b/0xca0 mm/khugepaged.c:2818
madvise_vma_behavior+0x649/0x20e0 mm/madvise.c:1086
madvise_walk_vmas+0x1c7/0x2b0 mm/madvise.c:1260
do_madvise.part.0+0x31c/0x470 mm/madvise.c:1439
do_madvise mm/madvise.c:1452 [inline]
__do_sys_madvise mm/madvise.c:1452 [inline]
__se_sys_madvise mm/madvise.c:1450 [inline]
__x64_sys_madvise+0x117/0x150 mm/madvise.c:1450
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Chain exists of:
&vma->vm_lock->lock --> fs_reclaim --> &mapping->i_mmap_rwsem

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&mapping->i_mmap_rwsem);
                               lock(fs_reclaim);
                               lock(&mapping->i_mmap_rwsem);
  lock(&vma->vm_lock->lock);

*** DEADLOCK ***
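
Editorial illustration, not part of the syzbot output: the check reported above can be pictured as a reachability test on a lock-order graph. The sketch below is a simplified model, assuming an edge A -> B means "B was acquired while A was held"; the names EXISTING, reachable and would_deadlock are hypothetical and do not reflect how lockdep is actually implemented. The edges are copied from the "Chain exists of" line.

```python
# Lock-order edges already recorded, taken from the "Chain exists of" line.
# An edge A -> B means "B was acquired while A was held".
# All names here are hypothetical; this is a model, not lockdep's code.
EXISTING = {
    "vm_lock": ["fs_reclaim"],
    "fs_reclaim": ["i_mmap_rwsem"],
    "i_mmap_rwsem": [],
}


def reachable(graph, src, dst):
    """Return True if dst can be reached from src by following edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False


def would_deadlock(graph, held, wanted):
    """Taking `wanted` while holding `held` adds the edge held -> wanted.

    That edge closes a cycle exactly when `wanted` already reaches `held`.
    """
    return reachable(graph, wanted, held)


# The madvise path holds i_mmap_rwsem and then asks for the VMA lock.
print(would_deadlock(EXISTING, "i_mmap_rwsem", "vm_lock"))
```

In this model the new edge i_mmap_rwsem -> vm_lock is flagged because vm_lock already reaches i_mmap_rwsem through fs_reclaim, which matches the chain printed in the report.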

2 locks held by syz-executor115/5084:
#0: ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: i_mmap_lock_write include/linux/fs.h:468 [inline]
#0: ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: retract_page_tables mm/khugepaged.c:1745 [inline]
#0: ffff88801f93efa8 (&mapping->i_mmap_rwsem){++++}-{3:3}, at: collapse_file+0x3da6/0x5980 mm/khugepaged.c:2204
#1: ffff88807b06f098 (&mm->mmap_lock){++++}-{3:3}, at: mmap_write_trylock include/linux/mmap_lock.h:120 [inline]
#1: ffff88807b06f098 (&mm->mmap_lock){++++}-{3:3}, at: retract_page_tables mm/khugepaged.c:1797 [inline]
#1: ffff88807b06f098 (&mm->mmap_lock){++++}-{3:3}, at: collapse_file+0x4667/0x5980 mm/khugepaged.c:2204

stack backtrace:
CPU: 0 PID: 5084 Comm: syz-executor115 Not tainted 6.2.0-next-20230301-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2178
check_prev_add kernel/locking/lockdep.c:3098 [inline]
check_prevs_add kernel/locking/lockdep.c:3217 [inline]
validate_chain kernel/locking/lockdep.c:3832 [inline]
__lock_acquire+0x2ec7/0x5d40 kernel/locking/lockdep.c:5056
lock_acquire.part.0+0x11a/0x370 kernel/locking/lockdep.c:5669
down_write+0x92/0x200 kernel/locking/rwsem.c:1573
vma_start_write include/linux/mm.h:678 [inline]
retract_page_tables mm/khugepaged.c:1826 [inline]
collapse_file+0x4fa5/0x5980 mm/khugepaged.c:2204
hpage_collapse_scan_file+0xcd3/0x1680 mm/khugepaged.c:2358
madvise_collapse+0x53b/0xca0 mm/khugepaged.c:2818
madvise_vma_behavior+0x649/0x20e0 mm/madvise.c:1086
madvise_walk_vmas+0x1c7/0x2b0 mm/madvise.c:1260
do_madvise.part.0+0x31c/0x470 mm/madvise.c:1439
do_madvise mm/madvise.c:1452 [inline]
__do_sys_madvise mm/madvise.c:1452 [inline]
__se_sys_madvise mm/madvise.c:1450 [inline]
__x64_sys_madvise+0x117/0x150 mm/madvise.c:1450
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fcffa4a4b29
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe20f24e68 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcffa4a4b29
RDX: 0000000000000019 RSI: 0000000000600003 RDI: 0000000020000000
RBP: 00007fcffa468cd0 R08: 0000000000000000 R09: 0000000000000000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches