Re: upstream boot error: WARNING in mem_cgroup_css_alloc

From: Michal Hocko
Date: Thu Aug 13 2020 - 06:44:20 EST

This smells like 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to
the parent cgroup").

On Thu 13-08-20 02:47:26, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: dc06fe51 Merge tag 'rtc-5.9' of git://git.kernel.org/pub/s..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=12d34bd6900000
> kernel config: https://syzkaller.appspot.com/x/.config?x=9c89856ae5fc8b6
> dashboard link: https://syzkaller.appspot.com/bug?extid=35d35ef677516d65bdda
> compiler: gcc (GCC) 10.1.0-syz 20200507
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+35d35ef677516d65bdda@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> mem auto-init: stack:off, heap alloc:on, heap free:off
> Memory: 6461180K/7863916K available (116743K kernel code, 17665K rwdata, 21624K rodata, 2792K init, 23940K bss, 1402480K reserved, 0K cma-reserved)
> Running RCU self tests
> rcu: Preemptible hierarchical RCU implementation.
> rcu: RCU lockdep checking is enabled.
> rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=2.
> rcu: RCU callback double-/use-after-free debug enabled.
> rcu: RCU debug extended QS entry/exit.
> All grace periods are expedited (rcu_expedited).
> Trampoline variant of Tasks RCU enabled.
> rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
> rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
> NR_IRQS: 4352, nr_irqs: 440, preallocated irqs: 16
> random: get_random_bytes called from boot_init_stack_canary arch/x86/include/asm/stackprotector.h:80 [inline] with crng_init=0
> random: get_random_bytes called from start_kernel+0x23b/0x46a init/main.c:957 with crng_init=0
> Console: colour VGA+ 80x25
> printk: console [ttyS0] enabled
> printk: console [ttyS0] enabled
> printk: bootconsole [earlyser0] disabled
> printk: bootconsole [earlyser0] disabled
> Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
> ... MAX_LOCKDEP_SUBCLASSES: 8
> ... MAX_LOCK_DEPTH: 48
> ... MAX_LOCKDEP_KEYS: 8192
> ... CLASSHASH_SIZE: 4096
> ... MAX_LOCKDEP_ENTRIES: 32768
> ... MAX_LOCKDEP_CHAINS: 65536
> ... CHAINHASH_SIZE: 32768
> memory used by lock dependency info: 6301 kB
> memory used for stack traces: 4224 kB
> per task-struct memory footprint: 1920 bytes
> mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl
> ACPI: Core revision 20200717
> APIC: Switch to symmetric I/O mode setup
> x2apic enabled
> Switched APIC routing to physical x2apic.
> ..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
> clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x212735223b2, max_idle_ns: 440795277976 ns
> Calibrating delay loop (skipped) preset value.. 4600.00 BogoMIPS (lpj=23000000)
> pid_max: default: 32768 minimum: 301
> LSM: Security Framework initializing
> LSM: security= is ignored because it is superseded by lsm=
> Yama: becoming mindful.
> TOMOYO Linux initialized
> AppArmor: AppArmor initialized
> LSM support for eBPF active
> Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, vmalloc)
> Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, vmalloc)
> Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, vmalloc)
> Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, vmalloc)
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5226 memalloc_unuse_memcg include/linux/sched/mm.h:331 [inline]
> WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5226 mem_cgroup_css_alloc+0x535/0x1c30 mm/memcontrol.c:5285
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.8.0-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x18f/0x20d lib/dump_stack.c:118
> panic+0x2e3/0x75c kernel/panic.c:231
> __warn.cold+0x20/0x4a kernel/panic.c:600
> report_bug+0x1bd/0x210 lib/bug.c:198
> handle_bug+0x38/0x90 arch/x86/kernel/traps.c:234
> exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:254
> asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:536
> RIP: 0010:mem_cgroup_alloc mm/memcontrol.c:5226 [inline]
> RIP: 0010:mem_cgroup_css_alloc+0x535/0x1c30 mm/memcontrol.c:5284
> Code: 01 00 48 8d bb 28 14 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 74 81 e8 e0 00 c2 f9 e9 77 ff ff ff <0f> 0b e9 21 fc ff ff 48 89 ef e8 bc 28 c7 f9 48 b8 00 00 00 00 00
> RSP: 0000:ffffffff89a07e20 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: ffffffff89a99dc0 RCX: 1ffff110152b2028
> RDX: 1ffffffff135363d RSI: ffffffff83b0f2c2 RDI: ffffffff89a9b1e8
> RBP: ffff8880a9590000 R08: 0000000000000001 R09: ffff88821b801aef
> R10: 0000000000000000 R11: ffffffff810000d4 R12: ffffffff8ab224b4
> R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff8ab224d0
> cgroup_init_subsys+0x215/0x4d7 kernel/cgroup/cgroup.c:5587
> cgroup_init+0x359/0xa63 kernel/cgroup/cgroup.c:5713
> start_kernel+0x426/0x46a init/main.c:1035
> secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
> Rebooting in 86400 seconds..
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

--
Michal Hocko
SUSE Labs