Re: [syzbot] BUG: sleeping function called from invalid context in vm_area_dup

From: Aleksandr Nogikh
Date: Fri Oct 21 2022 - 20:22:44 EST


On Fri, Oct 21, 2022 at 4:50 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Fri, Oct 21, 2022 at 4:12 PM Aleksandr Nogikh <nogikh@xxxxxxxxxx> wrote:
> >
> > On Fri, Oct 21, 2022 at 2:52 PM 'Suren Baghdasaryan' via
> > syzkaller-bugs <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Thu, Oct 20, 2022 at 6:58 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, Oct 20, 2022 at 6:22 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, 20 Oct 2022 05:40:43 -0700 syzbot <syzbot+b910411d3d253dab25d8@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > > syzbot has found a reproducer for the following issue on:
> > > > >
> > > > > Thanks.
> > > > >
> > > > >
> > > > > > HEAD commit: acee3e83b493 Add linux-next specific files for 20221020
> > > > > > git tree: linux-next
> > > > > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=170a8016880000
> > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=c82245cfb913f766
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=b910411d3d253dab25d8
> > > > > > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > > > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=109e0372880000
> > > > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1770d752880000
> > > > > >
> > > > > > Downloadable assets:
> > > > > > disk image: https://storage.googleapis.com/syzbot-assets/98cc5896cded/disk-acee3e83.raw.xz
> > > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/b3d3eb3aa10a/vmlinux-acee3e83.xz
> > > > > >
> > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > Reported-by: syzbot+b910411d3d253dab25d8@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > > >
> > > > > > BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
> > > > >
> > > > > This is happening under dup_anon_vma_name().
> > > > >
> > > > > I can't spot preemption being disabled on that call path, and I assume
> > > > > this code has been exercised for some time.
> > > >
> > > > Indeed, it is unclear why copy_vma() would be called in atomic
> > > > context. I'll try to reproduce tomorrow. Maybe with lockdep enabled we
> > > > can get something interesting.
> > >
> > > Sorry for the delay. Having trouble booting the image built with the
> > > attached config. My qemu crashes with a "sched: CPU #1's llc-sibling
> > > CPU #0 is not on the same node! [node: 1 != 0]." warning before the
> > > crash. Trying to figure out why.
> >
> > qemu 6.2 changed the core-to-socket assignment and it looks like we
> > get such errors when a kernel with "numa=fake=" is run under qemu on a
> > system with multiple CPUs.
> >
> > You can try removing numa=fake=... from the CMDLINE config or just
> > manually setting the smp argument of the qemu process (e.g. -smp
> > 2,sockets=2,cores=1)
> >
> > See https://gitlab.com/qemu-project/qemu/-/issues/877
>
> That was it. Thank you, Aleksandr!
> I can boot with the image built using the attached config but still
> can't reproduce the issue using the C reproducer... Will keep it
> running for some time to see if it eventually shows up.

Just in case -- did you also try executing the reproducer against the
attached bootable disk image? Syzbot attaches the exact images on
which it managed to find the bug. The image should work for both GCE
and qemu.
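
For qemu, a minimal sketch of such a boot could look like the helper
below. It only prints a command line, it does not run anything. The
name qemu_cmd, the memory size, the forwarded ssh port and the local
file name are assumptions for illustration, not something syzbot
prescribes. The -smp values are the ones suggested earlier in this
thread.

```shell
# Hypothetical helper: print a qemu command line that boots a raw disk image.
# Assumptions (not from the report): x86_64 host with KVM, 2G of guest RAM,
# host port 10022 forwarded to the guest's ssh port, and an image that has
# already been downloaded and decompressed.
qemu_cmd() {
    image="$1"
    printf '%s ' qemu-system-x86_64 -enable-kvm -m 2G \
        -smp 2,sockets=2,cores=1 \
        -drive "file=${image},format=raw" \
        -net nic,model=e1000 \
        -net user,hostfwd=tcp:127.0.0.1:10022-:22 \
        -snapshot -nographic
    printf '\n'
}

# Possible usage, with the disk image linked in the report:
#   wget https://storage.googleapis.com/syzbot-assets/98cc5896cded/disk-acee3e83.raw.xz
#   xz -d disk-acee3e83.raw.xz
#   $(qemu_cmd disk-acee3e83.raw)   # relies on word splitting: no spaces in the path
```

-snapshot keeps the downloaded image unmodified between runs, which
helps if the reproducer has to be retried from a clean state.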

> Thanks,
> Suren.
>
> >
> > > defconfig with CONFIG_ANON_VMA_NAME=y boots fine but does not
> > > reproduce the issue.
> > >
> > > >
> > > > >
> > > > > I wonder if this could be fallout from the KSM locking error which
> > > > > https://lkml.kernel.org/r/8c86678a-3bfb-3854-b1a9-ae5969e730b8@xxxxxxxxxx
> > > > > addresses. Seems quite unlikely.
> > > > >
> > > > > > in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3602, name: syz-executor107
> > > > > > preempt_count: 1, expected: 0
> > > > > > RCU nest depth: 0, expected: 0
> > > > > > INFO: lockdep is turned off.
> > > > > > Preemption disabled at:
> > > > > > [<0000000000000000>] 0x0
> > > > > > CPU: 0 PID: 3602 Comm: syz-executor107 Not tainted 6.1.0-rc1-next-20221020-syzkaller #0
> > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022
> > > > > > Call Trace:
> > > > > > <TASK>
> > > > > > __dump_stack lib/dump_stack.c:88 [inline]
> > > > > > dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> > > > > > __might_resched.cold+0x222/0x26b kernel/sched/core.c:9890
> > > > > > might_alloc include/linux/sched/mm.h:274 [inline]
> > > > > > slab_pre_alloc_hook mm/slab.h:727 [inline]
> > > > > > slab_alloc_node mm/slub.c:3323 [inline]
> > > > > > slab_alloc mm/slub.c:3411 [inline]
> > > > > > __kmem_cache_alloc_lru mm/slub.c:3418 [inline]
> > > > > > kmem_cache_alloc+0x2e6/0x3c0 mm/slub.c:3427
> > > > > > vm_area_dup+0x81/0x380 kernel/fork.c:466
> > > > > > copy_vma+0x376/0x8d0 mm/mmap.c:3216
> > > > > > move_vma+0x449/0xf60 mm/mremap.c:626
> > > > > > __do_sys_mremap+0x487/0x16b0 mm/mremap.c:1075
> > > > > > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > > > > > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> > > > > > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > > > > RIP: 0033:0x7fd090fa5b29
> > > > > > Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
> > > > > > RSP: 002b:00007ffc2e90bd38 EFLAGS: 00000246 ORIG_RAX: 0000000000000019
> > > > > > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd090fa5b29
> > > > > > RDX: 0000000000001000 RSI: 0000000000004000 RDI: 00000000201c4000
> > > > > > RBP: 00007fd090f69cd0 R08: 00000000202ef000 R09: 0000000000000000
> > > > > > R10: 0000000000000003 R11: 0000000000000246 R12: 00007fd090f69d60
> > > > > > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> > > > > > </TASK>