Re: [syzbot] [mm?] [cgroups?] WARNING: bad unlock balance in lruvec_stat_mod_folio

From: Qi Zheng

Date: Mon Apr 13 2026 - 23:52:48 EST


Hi Shakeel,

On 4/14/26 6:28 AM, Shakeel Butt wrote:
> +Qi & Yosry
>
> On Tue, Apr 07, 2026 at 10:53:24AM -0700, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit: cc13002a9f98 Add linux-next specific files for 20260402
>> git tree: linux-next
>> console output: https://syzkaller.appspot.com/x/log.txt?x=10d8946a580000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=4e6c8be618ab359
>> dashboard link: https://syzkaller.appspot.com/bug?extid=1a3353a77896e73a8f53
>> compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>
> Let's wait for the reproducer. I can only think of cgroup_subsys_on_dfl() check
> returning different value in get_non_dying_memcg_start() and
> get_non_dying_memcg_end() to cause this uneven rcu unlock. However I can't think
> why and how that can happen.


My AI bot told me that the cgroup_subsys_on_dfl_key can be dynamically
modified at runtime during a rebind:

rebind_subsystems()
--> if (dst_root == &cgrp_dfl_root) {
            static_branch_enable(cgroup_subsys_on_dfl_key[ssid]);
    } else {
            dcgrp->subtree_control |= 1 << ssid;
            static_branch_disable(cgroup_subsys_on_dfl_key[ssid]);
    }

However, when I actually tested it, I hit the following error:

mount: /tmp/cg-rb-repro: mount point is busy.

Indeed, there are already many child cgroups under the cgroup v2 root
(the VM just booted):

root@localhost:~# find /sys/fs/cgroup -mindepth 1 -maxdepth 2 -type d | head -50
/sys/fs/cgroup/sys-kernel-debug.mount
/sys/fs/cgroup/dev-mqueue.mount
/sys/fs/cgroup/user.slice
/sys/fs/cgroup/user.slice/user-0.slice
/sys/fs/cgroup/sys-kernel-tracing.mount
/sys/fs/cgroup/init.scope
/sys/fs/cgroup/system.slice
/sys/fs/cgroup/system.slice/systemd-networkd.service
/sys/fs/cgroup/system.slice/systemd-udevd.service
/sys/fs/cgroup/system.slice/system-serial\x2dgetty.slice
/sys/fs/cgroup/system.slice/wpa_supplicant.service
/sys/fs/cgroup/system.slice/system-modprobe.slice
/sys/fs/cgroup/system.slice/systemd-journald.service
/sys/fs/cgroup/system.slice/unattended-upgrades.service
/sys/fs/cgroup/system.slice/system-systemd\x2dgrowfs.slice
/sys/fs/cgroup/system.slice/ssh.service
/sys/fs/cgroup/system.slice/dhcpcd.service
/sys/fs/cgroup/system.slice/systemd-resolved.service
/sys/fs/cgroup/system.slice/dbus.service
/sys/fs/cgroup/system.slice/systemd-timesyncd.service
/sys/fs/cgroup/system.slice/system-getty.slice
/sys/fs/cgroup/system.slice/systemd-logind.service
/sys/fs/cgroup/dev-hugepages.mount

So it seems impossible to rebind the memory controller in a production
environment that uses systemd?

Then I disabled systemd by booting with `init=/bin/bash` on the kernel
command line, and found that I could successfully run the following
commands:

root@(none):/# mkdir -p /tmp/cg-rb-repro
root@(none):/# mount -t cgroup -o none,name=rb none /tmp/cg-rb-repro
root@(none):/# mount -t cgroup -o remount,memory none /tmp/cg-rb-repro
[ 65.903125][ T241] option changes via remount are deprecated (pid=241 comm=mount)
root@(none):/# mount -t cgroup -o remount,name=rb none /tmp/cg-rb-repro
[ 73.405829][ T242] option changes via remount are deprecated (pid=242 comm=mount)
root@(none):/# umount /tmp/cg-rb-repro

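So the memory controller can be rebound at runtime, which flips the
static key between get_non_dying_memcg_start() and
get_non_dying_memcg_end(). It seems this race condition does exist.
Should we fix it?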
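
To make the suspected imbalance concrete, here is a minimal userspace
model. It is only a sketch: it assumes each helper takes or drops the RCU
read lock only when the on-dfl check is true (the real helpers may test
the opposite polarity, which gives the same imbalance). All names below
(on_dfl, rcu_depth, the model_*, racy_* and latched_* functions) are
hypothetical stand-ins, not kernel APIs, and latching the decision is
just one possible direction for a fix, not a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins only; none of these are kernel symbols. */
static bool on_dfl;	/* models the cgroup_subsys_on_dfl() result */
static int rcu_depth;	/* models the RCU read-side nesting depth */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }

/* Racy pattern: each side re-evaluates the key independently. */
static void racy_start(void) { if (on_dfl) model_rcu_read_lock(); }
static void racy_end(void)   { if (on_dfl) model_rcu_read_unlock(); }

/* Latched pattern: decide once, hand the decision to the end side. */
static bool latched_start(void)
{
	bool locked = on_dfl;

	if (locked)
		model_rcu_read_lock();
	return locked;
}

static void latched_end(bool locked)
{
	if (locked)
		model_rcu_read_unlock();
}

/* Leftover nesting depth when the key flips inside the section. */
int racy_depth_after_rebind(bool before, bool after)
{
	rcu_depth = 0;
	on_dfl = before;
	racy_start();
	on_dfl = after;		/* models a rebind flipping the static key */
	racy_end();
	return rcu_depth;
}

int latched_depth_after_rebind(bool before, bool after)
{
	bool locked;

	rcu_depth = 0;
	on_dfl = before;
	locked = latched_start();
	on_dfl = after;		/* same flip, but the end side ignores it */
	latched_end(locked);
	return rcu_depth;
}

/* Sanity checks for the model; call from a test harness or main(). */
void check_model(void)
{
	assert(racy_depth_after_rebind(false, true) == -1);	/* unlock without lock */
	assert(racy_depth_after_rebind(true, false) == 1);	/* lock never dropped */
	assert(latched_depth_after_rebind(false, true) == 0);
	assert(latched_depth_after_rebind(true, false) == 0);
}
```

In this model, a key that goes from off to on inside the section leaves
the depth at -1, i.e. an unlock with no matching lock, which is the
shape of the "bad unlock balance" warning. The latched variant stays
balanced in both directions.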

Thanks,
Qi