Re: rsync: page allocation stalls in kernel 4.9.10 to a VessRAID NAS

From: Michal Hocko
Date: Tue Feb 28 2017 - 09:15:39 EST


On Mon 27-02-17 16:36:43, Robert Kudyba wrote:
> Feb 27 04:36:54 curie kernel: rsync: page allocation stalls for 10699ms, order:0, mode:0x2420848(GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE)
> Feb 27 04:36:54 curie kernel: CPU: 2 PID: 32649 Comm: rsync Tainted: G L 4.9.10-200.fc25.i686+PAE #1
[...]

This is a lowmem request (i.e. only the DMA and Normal zones can be used).

> Feb 27 04:36:55 curie kernel: DMA free:5584kB min:2580kB low:3224kB high:3868kB active_anon:0kB inactive_anon:0kB active_file:148kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15916kB mlocked:0kB slab_reclaimable:10032kB slab_unreclaimable:120kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> Feb 27 04:36:55 curie kernel: lowmem_reserve[]: 0 751 8055 8055

This one is protected by the lowmem_reserve.
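
To illustrate that protection, here is a simplified sketch of the min
watermark check in Python, using the numbers from the DMA line above.
It assumes 4kB pages and that the class zone of the request is Normal
(index 1 in lowmem_reserve[], i.e. 751 pages). The function name is made
up for illustration, and the real kernel check takes more inputs than
this.

```python
PAGE_KB = 4  # assumption: 4kB pages


def passes_min_watermark(free_kb, min_kb, lowmem_reserve_pages):
    """Simplified order-0 check (hypothetical helper, not kernel code).

    The zone is usable only if free memory exceeds the min watermark
    plus the reserve held back from this class of request.
    """
    reserve_kb = lowmem_reserve_pages * PAGE_KB
    return free_kb > min_kb + reserve_kb


# DMA zone: free:5584kB min:2580kB, lowmem_reserve[1] = 751 pages
dma_ok = passes_min_watermark(free_kb=5584, min_kb=2580,
                              lowmem_reserve_pages=751)
print(dma_ok)  # prints False: 5584 is not above 2580 + 3004
```

Under those assumptions the free memory in the DMA zone sits exactly at
min plus the reserve, so the zone cannot serve this request.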

> Feb 27 04:36:55 curie kernel: Normal free:128408kB min:128488kB low:160608kB high:192728kB active_anon:0kB inactive_anon:0kB active_file:44340kB inactive_file:4720kB unevictable:0kB writepending:0kB present:892920kB managed:791608kB mlocked:0kB slab_reclaimable:569704kB slab_unreclaimable:29448kB kernel_stack:1168kB pagetables:0kB bounce:0kB free_pcp:1568kB local_pcp:236kB free_cma:0kB

And this one is hitting the min watermark while there is not really
much to reclaim: only the page cache, which might be pinned and not
reclaimable from this context because this is a GFP_NOFS request. It is
not all that surprising that the reclaim context struggles to get some
memory. There is a huge amount of reclaimable slab, which is probably
just making slow progress.
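
For reference, the numbers behind that reading can be pulled out of the
Normal zone line with a few lines of Python. This is only a sketch: the
string below is a shortened copy of the quoted line keeping the fields
used here, and the helper name is made up for illustration.

```python
import re

# Selected fields copied from the Normal zone line quoted above (kB).
normal_line = (
    "Normal free:128408kB min:128488kB low:160608kB high:192728kB "
    "active_file:44340kB inactive_file:4720kB managed:791608kB "
    "slab_reclaimable:569704kB slab_unreclaimable:29448kB"
)


def parse_zone_kb(line):
    """Return a dict of the 'name:<N>kB' fields found in a zone line."""
    return {name: int(kb) for name, kb in re.findall(r"(\w+):(\d+)kB", line)}


zone = parse_zone_kb(normal_line)
deficit_kb = zone["min"] - zone["free"]
page_cache_kb = zone["active_file"] + zone["inactive_file"]
slab_share = zone["slab_reclaimable"] / zone["managed"]

print(f"below min watermark by {deficit_kb} kB")
print(f"page cache: {page_cache_kb} kB")
print(f"reclaimable slab: {slab_share:.0%} of managed memory")
```

Free memory is 80kB under the min watermark, the page cache is only
about 48MB, and reclaimable slab accounts for roughly 72% of the managed
memory in the zone.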

That is not completely surprising on a 32-bit system, I am afraid.

Btw. does the stall keep repeating with increasing stall times, or does
it get resolved eventually?

--
Michal Hocko
SUSE Labs