Re: [syzbot] [mm?] kernel BUG in move_pages

From: David Hildenbrand
Date: Thu Jan 11 2024 - 16:01:01 EST


On 11.01.24 21:20, Suren Baghdasaryan wrote:
On Thu, Jan 11, 2024 at 6:58 PM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 11.01.24 19:34, Suren Baghdasaryan wrote:
On Thu, Jan 11, 2024 at 8:44 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:

On Thu, Jan 11, 2024 at 8:40 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:

On Thu, Jan 11, 2024 at 8:25 AM syzbot
<syzbot+705209281e36404998f6@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Hello,

syzbot found the following issue on:

HEAD commit: e2425464bc87 Add linux-next specific files for 20240105
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=14941cdee80000
kernel config: https://syzkaller.appspot.com/x/.config?x=4056b9349f3da8c9
dashboard link: https://syzkaller.appspot.com/bug?extid=705209281e36404998f6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=125d0a09e80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15bc7331e80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/2f738185e2cf/disk-e2425464.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b248fcf4ea46/vmlinux-e2425464.xz
kernel image: https://storage.googleapis.com/syzbot-assets/a9945c8223f4/bzImage-e2425464.xz

The issue was bisected to:

commit adef440691bab824e39c1b17382322d195e1fab0
Author: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Date: Wed Dec 6 10:36:56 2023 +0000

userfaultfd: UFFDIO_MOVE uABI

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=11cb6ea9e80000
final oops: https://syzkaller.appspot.com/x/report.txt?x=13cb6ea9e80000
console output: https://syzkaller.appspot.com/x/log.txt?x=15cb6ea9e80000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+705209281e36404998f6@xxxxxxxxxxxxxxxxxxxxxxxxx
Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")

do_one_initcall+0x128/0x680 init/main.c:1237
do_initcall_level init/main.c:1299 [inline]
do_initcalls init/main.c:1315 [inline]
do_basic_setup init/main.c:1334 [inline]
kernel_init_freeable+0x692/0xc30 init/main.c:1552
kernel_init+0x1c/0x2a0 init/main.c:1442
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:1035!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 5068 Comm: syz-executor191 Not tainted 6.7.0-rc8-next-20240105-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
RIP: 0010:PageAnonExclusive include/linux/page-flags.h:1035 [inline]

From a quick look, I think the new ioctl is being used against a
file-backed page and that's why PageAnonExclusive() throws this error.
I'll confirm if this is indeed the case and will add checks for that
case. Thanks!

Hmm. Looking at the reproducer, it does not look like file-backed
memory... Anyway, I'm on it.

Looks like the test is trying to move the huge_zero_page. Wonder how
we should handle this. Just fail or do something else? Adding David
and Peter for feedback.

You'll need some special-casing to handle that. But it should be fairly
easy.

Ok, so should we treat zeropage the same as PAE and map destination
PTE/PMD to zeropage while clearing source PTE/PMD?

Likely yes. That way, it's transparent to user space what we are moving. (This sounds like an easy case where a prior write access should not be required just to move it.)
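
To make the proposal concrete, here is a toy userspace model of the special-casing, not kernel code. Every type and function name in it (struct entry, toy_move_entry, and so on) is hypothetical and only illustrates the idea: a zeropage source is recognized before the anon-exclusive check is ever consulted, the destination is mapped to the zeropage, and the source entry is cleared.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are kernel APIs. */
enum entry_kind { ENTRY_NONE, ENTRY_ZEROPAGE, ENTRY_ANON };

struct entry {
	enum entry_kind kind;
	bool anon_exclusive;	/* meaningful only for ENTRY_ANON */
	int page_id;		/* stand-in for the backing page */
};

/* Returns 0 on success, -1 if the move is refused. */
static int toy_move_entry(struct entry *dst, struct entry *src)
{
	if (dst->kind != ENTRY_NONE)
		return -1;	/* destination must be empty */

	switch (src->kind) {
	case ENTRY_ZEROPAGE:
		/*
		 * Special case: handle the zeropage first, so the
		 * anon-exclusive flag is never checked for it.
		 */
		dst->kind = ENTRY_ZEROPAGE;
		dst->anon_exclusive = false;
		dst->page_id = 0;
		break;
	case ENTRY_ANON:
		if (!src->anon_exclusive)
			return -1;	/* only exclusive anon pages move */
		*dst = *src;
		break;
	default:
		return -1;	/* nothing mapped at the source */
	}

	/* Clear the source entry in every successful case. */
	src->kind = ENTRY_NONE;
	src->anon_exclusive = false;
	src->page_id = 0;
	return 0;
}
```

In this model the caller sees the same result (source cleared, destination populated) whether the source was an exclusive anonymous page or the zeropage, which is the "transparent for user space" property.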

--
Cheers,

David / dhildenb