Re: general protection fault in madvise_cold_or_pageout_pte_range

From: Minchan Kim
Date: Mon Sep 14 2020 - 16:38:37 EST


On Mon, Sep 14, 2020 at 02:29:15AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 729e3d09 Merge tag 'ceph-for-5.9-rc5' of git://github.com/..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1482b99e900000
> kernel config: https://syzkaller.appspot.com/x/.config?x=8f5c353182ed6199
> dashboard link: https://syzkaller.appspot.com/bug?extid=ecf80462cb7d5d552bc7
> compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16e2a255900000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=164afdb3900000
>
> The issue was bisected to:
>
> commit 1a4e58cce84ee88129d5d49c064bd2852b481357
> Author: Minchan Kim <minchan@xxxxxxxxxx>
> Date: Wed Sep 25 23:49:15 2019 +0000
>
> mm: introduce MADV_PAGEOUT
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=127f973e900000
> final oops: https://syzkaller.appspot.com/x/report.txt?x=117f973e900000
> console output: https://syzkaller.appspot.com/x/log.txt?x=167f973e900000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+ecf80462cb7d5d552bc7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
>
> general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
> CPU: 1 PID: 6826 Comm: syz-executor142 Not tainted 5.9.0-rc4-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:__lock_acquire+0x84/0x2ae0 kernel/locking/lockdep.c:4296
> Code: ff df 8a 04 30 84 c0 0f 85 e3 16 00 00 83 3d 56 58 35 08 00 0f 84 0e 17 00 00 83 3d 25 c7 f5 07 00 74 2c 4c 89 e8 48 c1 e8 03 <80> 3c 30 00 74 12 4c 89 ef e8 3e d1 5a 00 48 be 00 00 00 00 00 fc
> RSP: 0018:ffffc90004b9f850 EFLAGS: 00010006
> RAX: 0000000000000003 RBX: 0000000000000001 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
> RBP: ffffc90004b9f9a8 R08: 0000000000000001 R09: 0000000000000000
> R10: fffffbfff131e2e6 R11: 0000000000000000 R12: ffff8880937161c0
> R13: 0000000000000018 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000002638880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000002100003f CR3: 00000000a49a2000 CR4: 00000000001506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> lock_acquire+0x140/0x6f0 kernel/locking/lockdep.c:5006
> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:354 [inline]
> madvise_cold_or_pageout_pte_range+0x52f/0x25c0 mm/madvise.c:389
> walk_pmd_range mm/pagewalk.c:89 [inline]
> walk_pud_range mm/pagewalk.c:160 [inline]
> walk_p4d_range mm/pagewalk.c:193 [inline]
> walk_pgd_range mm/pagewalk.c:229 [inline]
> __walk_page_range+0xe7b/0x1da0 mm/pagewalk.c:331
> walk_page_range+0x2c3/0x5c0 mm/pagewalk.c:427
> madvise_pageout_page_range mm/madvise.c:521 [inline]
> madvise_pageout mm/madvise.c:557 [inline]
> madvise_vma mm/madvise.c:946 [inline]
> do_madvise+0x12d0/0x2090 mm/madvise.c:1145
> __do_sys_madvise mm/madvise.c:1171 [inline]
> __se_sys_madvise mm/madvise.c:1169 [inline]
> __x64_sys_madvise+0x76/0x80 mm/madvise.c:1169
> do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46
> entry_SYSCALL_64_after_hwframe+0x44/0xa9

This looks like a bug where we access the pmd again after split_huge_page()
has split it, so the pmd would be NULL by then. Let me look at it.
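
A rough userspace sketch of how that path might be exercised is below. It
is not the syzbot C reproducer and is not confirmed to trigger the fault.
The idea is to back a PMD-aligned anonymous range with a transparent huge
page and then call MADV_PAGEOUT on only part of it, so that the kernel has
to split the huge page during the walk. The helper names
(pageout_partial_thp, align_up) and the 2 MiB huge page size are
assumptions made for illustration, and THP must be enabled for the split
to happen at all.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 14	/* value from the Linux uapi headers */
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21		/* value from the Linux uapi headers (5.4+) */
#endif

/* Assumption: PMD-sized huge pages are 2 MiB, as on x86-64. */
#define HPAGE_SIZE (2UL << 20)

/* Round addr up to the next multiple of align (align must be a power of 2). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: fault in one huge-page-aligned region, then ask
 * the kernel to page out only its first half.
 * Returns 0 on success, or the errno of the failing call.
 */
static int pageout_partial_thp(void)
{
	size_t len = 2 * HPAGE_SIZE;	/* over-allocate so we can align */
	int ret = 0;
	char *aligned;
	char *map;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return errno;

	aligned = (char *)align_up((uintptr_t)map, HPAGE_SIZE);

	/* Best-effort hint only; failure here is not fatal for the sketch. */
	madvise(aligned, HPAGE_SIZE, MADV_HUGEPAGE);

	/* Touch the whole region so it is populated before the pageout. */
	memset(aligned, 0x5a, HPAGE_SIZE);

	/* Partial range: covers only half of the PMD-sized region. */
	if (madvise(aligned, HPAGE_SIZE / 2, MADV_PAGEOUT) != 0)
		ret = errno;

	munmap(map, len);
	return ret;
}
```

Calling pageout_partial_thp() from main() is enough to try it. On kernels
without MADV_PAGEOUT the madvise() call is expected to fail with EINVAL,
which the helper returns to the caller.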

Thanks.