Re: [syzbot] [mm?] [cgroups?] WARNING: bad unlock balance in lruvec_stat_mod_folio

From: Qi Zheng

Date: Tue Apr 14 2026 - 22:24:11 EST




On 4/15/26 1:15 AM, Shakeel Butt wrote:
On Tue, Apr 14, 2026 at 11:52:13AM +0800, Qi Zheng wrote:
Hi Shakeel,

On 4/14/26 6:28 AM, Shakeel Butt wrote:
+Qi & Yosry

On Tue, Apr 07, 2026 at 10:53:24AM -0700, syzbot wrote:
Hello,

syzbot found the following issue on:

HEAD commit: cc13002a9f98 Add linux-next specific files for 20260402
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=10d8946a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=4e6c8be618ab359
dashboard link: https://syzkaller.appspot.com/bug?extid=1a3353a77896e73a8f53
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Let's wait for the reproducer. I can only think of the cgroup_subsys_on_dfl()
check returning different values in get_non_dying_memcg_start() and
get_non_dying_memcg_end(), which would cause this uneven rcu unlock. However,
I can't see why or how that could happen.


My AI bot told me that cgroup_subsys_on_dfl_key can be dynamically
modified at runtime during a rebind:

rebind_subsystems()
--> if (dst_root == &cgrp_dfl_root) {
            static_branch_enable(cgroup_subsys_on_dfl_key[ssid]);
    } else {
            dcgrp->subtree_control |= 1 << ssid;
            static_branch_disable(cgroup_subsys_on_dfl_key[ssid]);
    }

However, when I actually tested it, I hit the following error:

mount: /tmp/cg-rb-repro: mount point is busy.

Indeed, there are already many child cgroups under the cgroup v2 root
(the VM just booted):

root@localhost:~# find /sys/fs/cgroup -mindepth 1 -maxdepth 2 -type d | head -50
/sys/fs/cgroup/sys-kernel-debug.mount
/sys/fs/cgroup/dev-mqueue.mount
/sys/fs/cgroup/user.slice
/sys/fs/cgroup/user.slice/user-0.slice
/sys/fs/cgroup/sys-kernel-tracing.mount
/sys/fs/cgroup/init.scope
/sys/fs/cgroup/system.slice
/sys/fs/cgroup/system.slice/systemd-networkd.service
/sys/fs/cgroup/system.slice/systemd-udevd.service
/sys/fs/cgroup/system.slice/system-serial\x2dgetty.slice
/sys/fs/cgroup/system.slice/wpa_supplicant.service
/sys/fs/cgroup/system.slice/system-modprobe.slice
/sys/fs/cgroup/system.slice/systemd-journald.service
/sys/fs/cgroup/system.slice/unattended-upgrades.service
/sys/fs/cgroup/system.slice/system-systemd\x2dgrowfs.slice
/sys/fs/cgroup/system.slice/ssh.service
/sys/fs/cgroup/system.slice/dhcpcd.service
/sys/fs/cgroup/system.slice/systemd-resolved.service
/sys/fs/cgroup/system.slice/dbus.service
/sys/fs/cgroup/system.slice/systemd-timesyncd.service
/sys/fs/cgroup/system.slice/system-getty.slice
/sys/fs/cgroup/system.slice/systemd-logind.service
/sys/fs/cgroup/dev-hugepages.mount

So it seems it is impossible to rebind the memory controller in a
production environment that uses systemd?

Then I disabled systemd:

set `init=/bin/bash`

and found that I could successfully run the following commands:

root@(none):/# mkdir -p /tmp/cg-rb-repro
root@(none):/# mount -t cgroup -o none,name=rb none /tmp/cg-rb-repro
root@(none):/# mount -t cgroup -o remount,memory none /tmp/cg-rb-repro
[ 65.903125][ T241] option changes via remount are deprecated (pid=241 comm=mount)
root@(none):/# mount -t cgroup -o remount,name=rb none /tmp/cg-rb-repro
[ 73.405829][ T242] option changes via remount are deprecated (pid=242 comm=mount)
root@(none):/# umount /tmp/cg-rb-repro

So it seems this race condition does exist. Should we fix it?

This only succeeded because there weren't any active cgroups. Were you able to
trigger the warning as well? If not, I think we should just wait for

Nope.

a reproducer from syzbot before doing anything.

OK, let's wait for syzbot to reproduce it.