Re: [syzbot] [mm?] [bcachefs?] WARNING in lock_list_lru_of_memcg
From: Kairui Song
Date: Sun Feb 16 2025 - 11:15:58 EST
On Sat, Feb 15, 2025 at 7:24 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, 14 Feb 2025 10:11:19 -0800 syzbot <syzbot+38a0cbd267eff2d286ff@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> > syzbot has found a reproducer for the following issue on:
>
> Thanks. I doubt if bcachefs is implicated in this?
>
> > HEAD commit: 128c8f96eb86 Merge tag 'drm-fixes-2025-02-14' of https://g..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=148019a4580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=c776e555cfbdb82d
> > dashboard link: https://syzkaller.appspot.com/bug?extid=38a0cbd267eff2d286ff
> > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12328bf8580000
> >
> > Downloadable assets:
> > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-128c8f96.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/a97f78ac821e/vmlinux-128c8f96.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/f451cf16fc9f/bzImage-128c8f96.xz
> > mounted in repro: https://storage.googleapis.com/syzbot-assets/a7da783f97cf/mount_3.gz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+38a0cbd267eff2d286ff@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 5459 at mm/list_lru.c:96 lock_list_lru_of_memcg+0x39e/0x4d0 mm/list_lru.c:96
>
> VM_WARN_ON(!css_is_dying(&memcg->css));
I'm looking into this. The last time this warning triggered, the cause
was a list_lru user that had not initialized its memcg list_lru
properly before list_lru reclaim started. That was fixed by:
https://lore.kernel.org/all/20241222122936.67501-1-ryncsn@xxxxxxxxx/T/
This shouldn't be a big issue. Maybe there are leaks that will be
fixed upon reparenting, and this newly added sanity check might be too
lenient. I'm not 100% sure though.
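To illustrate what the check guards against, here is a minimal
userspace sketch, not the kernel code. It assumes that the lookup falls
back to the parent cgroup when a memcg has no list of its own, and that
the warning fires when such a memcg is not dying. All names in it
(model_memcg, model_find_list_owner, model_warnings) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a memory cgroup. */
struct model_memcg {
	struct model_memcg *parent;
	bool dying;	/* models css_is_dying() */
	bool has_list;	/* models "the per-memcg list exists" */
};

/* Counts how often the modelled sanity check fired. */
static int model_warnings;

/*
 * Walk from @memcg towards the root until a cgroup that owns a list is
 * found. A missing list on a cgroup that is not dying is the case the
 * modelled check complains about (assumption: this mirrors the intent
 * of VM_WARN_ON(!css_is_dying(&memcg->css))).
 */
static struct model_memcg *model_find_list_owner(struct model_memcg *memcg)
{
	while (memcg && !memcg->has_list) {
		if (!memcg->dying) {
			model_warnings++;
			fprintf(stderr, "model: live memcg without a list\n");
		}
		memcg = memcg->parent;
	}
	return memcg;
}
```

In this model a dying child without a list quietly resolves to its
parent, while a live child without a list also resolves to the parent
but bumps model_warnings, which corresponds to the case reported here.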
Unfortunately I haven't been able to reproduce the issue locally with
the reproducer yet. I'll keep the test running and see if it hits this
WARN_ON.