Re: [PATCH] zram_drv: add __GFP_NOMEMALLOC not to use ALLOC_NO_WATERMARKS
From: Minchan Kim
Date: Mon Jun 06 2022 - 15:46:45 EST
On Fri, Jun 03, 2022 at 02:57:47PM +0900, Jaewon Kim wrote:
> The atomic page allocation failure sometimes happened, and most of them
> seem to occur during boot time.
>
> <4>[ 59.707645] system_server: page allocation failure: order:0, mode:0xa20(GFP_ATOMIC), nodemask=(null),cpuset=foreground-boost,mems_allowed=0
> <4>[ 59.707676] CPU: 5 PID: 1209 Comm: system_server Tainted: G S O 5.4.161-qgki-24219806-abA236USQU0AVE1 #1
> <4>[ 59.707691] Call trace:
> <4>[ 59.707702] dump_backtrace.cfi_jt+0x0/0x4
> <4>[ 59.707712] show_stack+0x18/0x24
> <4>[ 59.707719] dump_stack+0xa4/0xe0
> <4>[ 59.707728] warn_alloc+0x114/0x194
> <4>[ 59.707734] __alloc_pages_slowpath+0x828/0x83c
> <4>[ 59.707740] __alloc_pages_nodemask+0x2b4/0x310
> <4>[ 59.707747] alloc_slab_page+0x40/0x5c8
> <4>[ 59.707753] new_slab+0x404/0x420
> <4>[ 59.707759] ___slab_alloc+0x224/0x3b0
> <4>[ 59.707765] __kmalloc+0x37c/0x394
> <4>[ 59.707773] context_struct_to_string+0x110/0x1b8
> <4>[ 59.707778] context_add_hash+0x6c/0xc8
> <4>[ 59.707785] security_compute_sid.llvm.13699573597798246927+0x508/0x5d8
> <4>[ 59.707792] security_transition_sid+0x2c/0x38
> <4>[ 59.707804] selinux_socket_create+0xa0/0xd8
> <4>[ 59.707811] security_socket_create+0x68/0xbc
> <4>[ 59.707818] __sock_create+0x8c/0x2f8
> <4>[ 59.707823] __sys_socket+0x94/0x19c
> <4>[ 59.707829] __arm64_sys_socket+0x20/0x30
> <4>[ 59.707836] el0_svc_common+0x100/0x1e0
> <4>[ 59.707841] el0_svc_handler+0x68/0x74
> <4>[ 59.707848] el0_svc+0x8/0xc
> <4>[ 59.707853] Mem-Info:
> <4>[ 59.707890] active_anon:223569 inactive_anon:74412 isolated_anon:0
> <4>[ 59.707890] active_file:51395 inactive_file:176622 isolated_file:0
> <4>[ 59.707890] unevictable:1018 dirty:211 writeback:4 unstable:0
> <4>[ 59.707890] slab_reclaimable:14398 slab_unreclaimable:61909
> <4>[ 59.707890] mapped:134779 shmem:1231 pagetables:26706 bounce:0
> <4>[ 59.707890] free:528 free_pcp:844 free_cma:147
> <4>[ 59.707900] Node 0 active_anon:894276kB inactive_anon:297648kB active_file:205580kB inactive_file:706488kB unevictable:4072kB isolated(anon):0kB isolated(file):0kB mapped:539116kB dirty:844kB writeback:16kB shmem:4924kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> <4>[ 59.707912] Normal free:2112kB min:7244kB low:68892kB high:72180kB active_anon:893140kB inactive_anon:297660kB active_file:204740kB inactive_file:706396kB unevictable:4072kB writepending:860kB present:3626812kB managed:3288700kB mlocked:4068kB kernel_stack:62416kB shadow_call_stack:15656kB pagetables:106824kB bounce:0kB free_pcp:3372kB local_pcp:176kB free_cma:588kB
> <4>[ 59.707915] lowmem_reserve[]: 0 0
> <4>[ 59.707922] Normal: 8*4kB (H) 5*8kB (H) 13*16kB (H) 25*32kB (H) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1080kB
> <4>[ 59.707942] 242549 total pagecache pages
> <4>[ 59.707951] 12446 pages in swap cache
> <4>[ 59.707956] Swap cache stats: add 212408, delete 199969, find 36869/71571
> <4>[ 59.707961] Free swap = 3445756kB
> <4>[ 59.707965] Total swap = 4194300kB
> <4>[ 59.707969] 906703 pages RAM
> <4>[ 59.707973] 0 pages HighMem/MovableOnly
> <4>[ 59.707978] 84528 pages reserved
> <4>[ 59.707982] 49152 pages cma reserved
>
> kswapd and other reclaim contexts may not be able to keep enough free
> pages available when many atomic allocations occur in a short time.
> And zram does not help these atomic allocations even though zram is
> being used for reclaim.
>
> To get one zs object of a specific size, zram may allocate several
> pages. And this can happen for several class sizes at the same time.
> It means zram may consume more pages than it reclaims for a single
> page. This inefficiency can exhaust all the free pages below the min
> watermark when the allocation comes from a process with PF_MEMALLOC,
> such as kswapd.
However, that's how zram has worked for a long time (allocating memory
under memory pressure), and many folks have already raised min_free_kbytes
when they use zram as swap. If we don't allow the allocation, swap-out
fails more easily than before, which would break existing tunings.
>
> We can avoid this by adding __GFP_NOMEMALLOC, so that even a
> PF_MEMALLOC process won't use ALLOC_NO_WATERMARKS.
>
> Signed-off-by: Jaewon Kim <jaewon31.kim@xxxxxxxxxxx>
> ---
> drivers/block/zram/zram_drv.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index b8549c61ff2c..39cd1397ed3b 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1383,6 +1383,7 @@ static int __zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
>
> handle = zs_malloc(zram->mem_pool, comp_len,
> __GFP_KSWAPD_RECLAIM |
> + __GFP_NOMEMALLOC |
> __GFP_NOWARN |
> __GFP_HIGHMEM |
> __GFP_MOVABLE);
> --
> 2.17.1
>
>