Re: [init/initramfs.c] e7cb072eb9: invoked_oom-killer:gfp_mask=0x

From: Oliver Sang
Date: Fri Jun 11 2021 - 11:13:53 EST


hi Rasmus,

On Fri, Jun 11, 2021 at 11:20:18AM +0200, Rasmus Villemoes wrote:
> On 11/06/2021 10.48, Oliver Sang wrote:
> > hi Rasmus,
> >
> > On Tue, Jun 08, 2021 at 09:42:58AM +0200, Rasmus Villemoes wrote:
> >> On 07/06/2021 16.44, kernel test robot wrote:
> >>>
>
>
> >> Also, I don't have 16G to give to a virtual machine. I tried running the
> >> bzImage with that modules.cgz under qemu with some naive parameters just
> >> to get some output [1], but other than failing because there's no rootfs
> >> to mount (as expected), I only managed to make it fail when providing
> >> too little memory (the .cgz is around 70M, decompressed about 200M -
> >> giving '-m 1G' to qemu works fine). You mention the vmalloc= argument,
> >> but I can't make the decompression fail when passing either vmalloc=128M
> >> or vmalloc=512M or no vmalloc= at all.
> >
> > sorry about this. we also tried to follow exactly above steps to test on
> > some local machine (8G memory), but cannot reproduce. we are analyzing
> > what's the diference in our automaion run in test cluster, which reproduced
> > the issue consistently. will update you when we have findings.
>
> OK. It's really odd that providing the VM with _more_ memory makes it
> fail (other then the obvious failure in the other direction when there's
> simply not enough memory for the unpacked initramfs itself). But
> unfortunately that also sounds like I won't be able to reproduce with
> the HW I have.
>
> >> As an extra data point, what happens if you add initramfs_async=0 to the
> >> command line?
> >
> > yes, we tested this before sending out the report. the issue gone
> > if initramfs_async=0 is added.
>
> Hm. Sounds like some initcall after rootfs_initcall time must
> allocate/hog a lot of memory, perhaps with some heuristic depending on
> how much is available.
>
> Can you try with initcall_debug=1? I think that should produce a lot of
> output, hopefully that would make it possible to see which initcalls
> have been done just prior to (or while) the initramfs unpacking hits ENOMEM.

thanks a lot for guidance!

by several jobs with initcall_debug=1, the oom happens consistently after
"calling panel_init_module"

below is an example, and detail dmesg.xz attached:

[ 15.120470] calling img_ascii_lcd_driver_init+0x0/0x16 @ 1
[ 15.122305] initcall img_ascii_lcd_driver_init+0x0/0x16 returned 0 after 128 usecs
[ 15.124660] calling panel_init_module+0x0/0x1f3 @ 1
[ 19.941826] kworker/u4:0 invoked oom-killer: gfp_mask=0x40cd0(GFP_KERNEL|__GFP_COMP|__GFP_RECLAIMABLE), order=0, oom_score_adj=0
[ 19.967679] CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.12.0-11533-ge7cb072eb988 #1
[ 19.968362] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 19.968362] Workqueue: events_unbound async_run_entry_fn
[ 19.968362] Call Trace:
[ 19.968362] dump_stack+0x69/0x85
[ 19.968362] dump_header+0x57/0x3ff
[ 19.968362] ? lock_release+0xa8/0x2b0
[ 19.968362] ? out_of_memory+0x310/0x4d0
[ 19.968362] out_of_memory.cold+0x2b/0x6f
[ 19.968362] __alloc_pages_slowpath+0xa3a/0xbb0
[ 19.968362] __alloc_pages+0x2a6/0x320
[ 19.968362] allocate_slab+0x242/0x300
[ 19.968362] ___slab_alloc+0x1e3/0x740
[ 19.968362] ? alloc_inode+0x77/0x80
[ 19.968362] ? lock_is_held_type+0x8a/0xf0
[ 19.968362] __slab_alloc+0x27/0x60
[ 19.968362] ? __slab_alloc+0x27/0x60
[ 19.968362] ? alloc_inode+0x77/0x80
[ 19.968362] kmem_cache_alloc+0x24d/0x290
[ 19.968362] ? alloc_inode+0x77/0x80
[ 19.968362] alloc_inode+0x77/0x80
[ 19.968362] new_inode_pseudo+0xf/0x50
[ 19.968362] new_inode+0x17/0x30
[ 19.968362] ramfs_get_inode+0x35/0x150
[ 19.968362] ? _raw_spin_unlock_irqrestore+0x2d/0x60
[ 19.968362] ? trace_hardirqs_on+0x48/0xf0
[ 19.968362] ? _raw_spin_unlock_irqrestore+0x38/0x60
[ 19.968362] ? lock_is_held_type+0x8a/0xf0
[ 19.968362] ramfs_mknod+0x36/0xa0
[ 19.968362] ? ramfs_mknod+0x36/0xa0
[ 19.968362] ? trace_preempt_on+0x36/0xd0
[ 19.968362] ? d_add+0xe4/0x1c0
[ 19.968362] ? ramfs_mkdir+0x50/0x50
[ 19.968362] ramfs_create+0x20/0x30
[ 19.968362] ? ramfs_create+0x20/0x30
[ 19.968362] path_openat+0x594/0xef0
[ 19.968362] do_filp_open+0x84/0xf0
[ 19.968362] ? lock_is_held_type+0x8a/0xf0
[ 19.968362] file_open_name+0xaf/0x1b0
[ 19.968362] filp_open+0x27/0x50
[ 19.968362] do_name+0xbc/0x27c
[ 19.968362] write_buffer+0x22/0x31
[ 19.968362] flush_buffer+0x26/0x74
[ 19.968362] __gunzip+0x204/0x28a
[ 19.968362] ? decompress_method+0x3b/0x3b
[ 19.968362] ? __gunzip+0x28a/0x28a
[ 19.968362] gunzip+0x16/0x18
[ 19.968362] ? write_buffer+0x31/0x31
[ 19.968362] ? initrd_load+0x23d/0x23d
[ 19.968362] unpack_to_rootfs+0x15d/0x273
[ 19.968362] ? write_buffer+0x31/0x31
[ 19.968362] ? initrd_load+0x23d/0x23d
[ 19.968362] do_populate_rootfs+0x5f/0x13f
[ 19.968362] async_run_entry_fn+0x24/0xb0
[ 19.968362] process_one_work+0x1ee/0x540
[ 19.968362] worker_thread+0x15e/0x3c0
[ 19.968362] kthread+0x109/0x140
[ 19.968362] ? process_one_work+0x540/0x540
[ 19.968362] ? kthread_insert_work_sanity_check+0x60/0x60
[ 19.968362] ret_from_fork+0x19/0x30
[ 20.217543] Mem-Info:
[ 20.218400] active_anon:0 inactive_anon:0 isolated_anon:0
[ 20.218400] active_file:0 inactive_file:0 isolated_file:0
[ 20.218400] unevictable:85038 dirty:0 writeback:0
[ 20.218400] slab_reclaimable:7910 slab_unreclaimable:1402
[ 20.218400] mapped:0 shmem:0 pagetables:4 bounce:0
[ 20.218400] free:3987860 free_pcp:117 free_cma:0
[ 20.226846] Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:340152kB isolated(anon):0kB isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:464kB pagetables:16kB all_unreclaimable? yes
[ 20.233380] DMA free:572kB min:2296kB low:2356kB high:2416kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:4664kB writepending:0kB present:15992kB managed:15916kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 20.240158] lowmem_reserve[]: 0 48 15946 15946
[ 20.241518] Normal free:2816kB min:2816kB low:3008kB high:3200kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:10008kB writepending:0kB present:481272kB managed:49344kB mlocked:0kB bounce:0kB free_pcp:28kB local_pcp:16kB free_cma:0kB
[ 20.248350] lowmem_reserve[]: 0 0 127183 127183
[ 20.249784] HighMem free:15948052kB min:512kB low:64120kB high:127728kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:325496kB writepending:0kB present:16279432kB managed:16279432kB mlocked:0kB bounce:0kB free_pcp:440kB local_pcp:208kB free_cma:0kB
[ 20.257038] lowmem_reserve[]: 0 0 0 0
[ 20.258272] DMA: 0*4kB 0*8kB 1*16kB (E) 1*32kB (E) 0*64kB 0*128kB 0*256kB 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB = 560kB
[ 20.261437] Normal: 12*4kB (UME) 4*8kB (UM) 3*16kB (UM) 2*32kB (UE) 1*64kB (U) 2*128kB (UM) 1*256kB (M) 2*512kB (ME) 1*1024kB (U) 0*2048kB 0*4096kB = 2816kB
[ 20.265294] HighMem: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 0*64kB 2*128kB (UM) 2*256kB (UM) 1*512kB (M) 1*1024kB (M) 2*2048kB (M) 3892*4096kB (M) = 15948052kB
[ 20.269081] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 20.271583] 85050 total pagecache pages
[ 20.272767] 4194174 pages RAM
[ 20.273839] 4069858 pages HighMem/MovableOnly
[ 20.275209] 108001 pages reserved
[ 20.276264] Tasks state (memory values in pages):
[ 20.277590] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 20.280179] Out of memory and no killable processes...
[ 20.281709] Kernel panic - not syncing: System is deadlocked on memory



>
> Thanks,
> Rasmus

Attachment: dmesg.xz
Description: application/xz