Re: [syzbot] KASAN: invalid-access Read in copy_page
From: Catalin Marinas
Date: Tue Sep 06 2022 - 12:44:58 EST

On Tue, Sep 06, 2022 at 04:39:57PM +0200, Andrey Konovalov wrote:
> On Tue, Sep 6, 2022 at 4:29 PM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > > > Does it take long to reproduce this kasan warning?
> > >
> > > syzbot finds several such cases every day (200 crashes for the past 35 days):
> > > https://syzkaller.appspot.com/bug?extid=c2c79c6d6eddc5262b77
> > > So once it reaches the tested tree, we should have an answer within a day.
>
> To be specific, this syzkaller instance fuzzes the mainline, so the
> patch with the WARN_ON needs to end up there.
>
> If this is unacceptable, perhaps, we could switch the MTE syzkaller
> instance to the arm64 testing tree.

It needs some more digging first. My first guess was that a PROT_MTE
page was mapped into the user address space and the task repainted it
but I don't think that's the case.

> > That's good to know. BTW, does syzkaller write tags in mmap'ed pages or
> > only issues random syscalls?
>
> syzkaller doesn't write tags. Or, at least, shouldn't. Theoretically
> it could come up with some way to generate instructions that write
> tags, but this is unlikely.

Yeah. And colouring an entire page with the same tag is even less
likely.

> > I'm trying to figure out whether tag 0xf2
> > was written by the kernel without updating the corresponding
> > page_kasan_tag() or it was syzkaller recolouring the page.
>
> Just in case, I want to point out that the kasantag == 0xa from the
> page flags matches the pointer tag 0xf5 in the report. The tag value
> is stored bitwise-inverted in the page flags. Not that this matters in
> this case though.

Yes, I'm aware of this. So copy_page() tries to read from
page_address(src) with kasantag == 0xa (real tag 0xf5) while the
in-memory tag is 0xf2. Since the user didn't repaint the page, I'm
trying to figure out what set the tags to 0xf2 while leaving
page_kasan_tag() at 0xf5. Some of the page_kasan_tag_reset() calls in
the past could have hidden a different issue.

Since I can't find the kernel boot log for these runs, is there any kind
of swap enabled? I'm trying to narrow down where the problem may be.
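
For reference, the tag arithmetic discussed above can be sketched in
plain user-space C. This is only an illustration, not the kernel code:
the helper names are hypothetical, the 4-bit field width is an
assumption matching hardware tag-based KASAN, and the comparison is a
simplified model of an MTE tag check on the low nibble.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the page-flags field holds 4 tag bits (MTE-style tags). */
#define TAG_FIELD_MASK 0xfu

/* Hypothetical helper: invert the tag and keep the bits that fit the field. */
static uint8_t tag_to_flags_field(uint8_t tag)
{
	return (uint8_t)((tag ^ 0xffu) & TAG_FIELD_MASK);
}

/* Hypothetical helper: invert the stored field to recover the tag. */
static uint8_t flags_field_to_tag(uint8_t field)
{
	return (uint8_t)(field ^ 0xffu);
}

/* Simplified check: compare the low 4 bits of pointer and memory tags. */
static bool tags_match(uint8_t ptr_tag, uint8_t mem_tag)
{
	return (ptr_tag & 0xfu) == (mem_tag & 0xfu);
}
```

With the values from the report, tag_to_flags_field(0xf5) gives 0xa,
flags_field_to_tag(0xa) gives 0xf5, and tags_match(0xf5, 0xf2) is false,
which is the mismatch copy_page() trips over.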
--
Catalin