Re: [syzbot] [fs?] WARNING in pagemap_scan_pmd_entry

From: Peter Xu
Date: Thu Nov 16 2023 - 11:49:27 EST


On Thu, Nov 16, 2023 at 07:38:00AM -0800, Andrei Vagin wrote:
> On Wed, Nov 15, 2023 at 4:53 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > Hi, Andrei, Muhammad,
> >
> > I had a look (as it triggered the guard I added before..), and I think I
> > know what happened. So far I think it's a question about the new ioctl()
> > interface, which I'd like to double check with you all. See below.
> >
> > On Wed, Nov 15, 2023 at 01:07:18PM -0800, Andrei Vagin wrote:
> > > Cc: Peter and Muhammad
> > >
> > > On Wed, Nov 15, 2023 at 6:41 AM syzbot
> > > <syzbot+e94c5aaf7890901ebf9b@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit: c42d9eeef8e5 Merge tag 'hardening-v6.7-rc2' of git://git.k..
> > > > git tree: upstream
> > > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=13626650e80000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=84217b7fc4acdc59
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=e94c5aaf7890901ebf9b
> > > > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15d73be0e80000
> > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13670da8e80000
> > > >
> > > > Downloadable assets:
> > > > disk image: https://storage.googleapis.com/syzbot-assets/a595d90eb9af/disk-c42d9eee.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/c1e726fedb94/vmlinux-c42d9eee.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/cb43ae262d09/bzImage-c42d9eee.xz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+e94c5aaf7890901ebf9b@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > >
> > > > ------------[ cut here ]------------
> > > > WARNING: CPU: 1 PID: 5071 at arch/x86/include/asm/pgtable.h:403 pte_uffd_wp arch/x86/include/asm/pgtable.h:403 [inline]
> >
> > This is the guard I added to detect the writable bit being set while the
> > uffd-wp bit is still set. It means something obviously went wrong.
> >
> > Here, afaict, the problem is that ioctl(PAGEMAP_SCAN) allows applying the
> > uffd-wp bit to a VMA that is not even registered with userfaultfd. Then,
> > when the page is written, do_wp_page() tries to reuse the anonymous page
> > with the uffd-wp bit still set, and sets the W bit on top of it.
>
> Thank you for looking at this.
>
> >
> > Below change works for me:
> >
> > ===8<===
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index ef2eb12906da..8a2500fa4580 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -1987,6 +1987,12 @@ static int pagemap_scan_test_walk(unsigned long start, unsigned long end,
> >  		vma_category |= PAGE_IS_WPALLOWED;
> >  	else if (p->arg.flags & PM_SCAN_CHECK_WPASYNC)
> >  		return -EPERM;
> > +	else
> > +		/*
> > +		 * Neither has the VMA enabled WP tracking, nor does the
> > +		 * user want to explicitly fail the walk. Skip the vma.
> > +		 */
> > +		return 1;
>
> In this case, I think we need to check the PM_SCAN_WP_MATCHING flag
> and skip these vma-s only if it is set.
>
> If PM_SCAN_WP_MATCHING isn't set, this ioctl returns page flags and
> can be used without the intention of tracking memory changes.
>
> >
> >  	if (vma->vm_flags & VM_PFNMAP)
> >  		return 1;
> > ===8<===
> >
> > This is based on my reading of the pagemap scan flags:
> >
> > - Write-protect the pages. The ``PM_SCAN_WP_MATCHING`` is used to write-protect
> > the pages of interest. The ``PM_SCAN_CHECK_WPASYNC`` aborts the operation if
> > non-Async Write Protected pages are found. The ``PM_SCAN_WP_MATCHING`` can be
> > used with or without ``PM_SCAN_CHECK_WPASYNC``.
> >
> > PM_SCAN_CHECK_WPASYNC is what enforces the check; when it is not set, we
> > need to skip any vma that is not registered properly. Does it look
> > reasonable to you?
>
> I think the idea here could be to report page flags but not
> write-protect such pages.

Ah, I think I understand slightly better now. Below is my 2nd try..

Meanwhile, I think this won't work:

	/* 9. Memory mapped file */
	fd = open(__FILE__, O_RDONLY);
	if (fd < 0)
		ksft_exit_fail_msg("%s Memory mapped file\n", __func__);

We can't assume the source file named by __FILE__ exists when the test runs
(e.g. when it is run outside the source dir).. Attached one more patch for that.
I'll repost formally if that looks good to you.
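
One way to get a file that is guaranteed to exist, sketched here under the
assumption that /proc is mounted, is to open the running test binary itself.
The helper name is hypothetical, and this is not necessarily what the
attached patch does; passing argv[0] down to the test would be another option.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: return a read-only fd for a file that is known to
 * exist while the test runs, namely the test binary itself.
 * Assumes a Linux system with /proc mounted; returns -1 on failure.
 */
static int open_file_for_mmap_test(void)
{
	return open("/proc/self/exe", O_RDONLY);
}
```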

===8<===