Re: getting oom/stalls for ltp test cpuset01 with latest/4.9 kernel
From: Vlastimil Babka
Date: Wed Jan 11 2017 - 06:06:17 EST
On 01/11/2017 11:50 AM, Ganapatrao Kulkarni wrote:
> Hi,
>
> we are seeing OOM/stalls messages when we run ltp cpuset01(cpuset01 -I
> 360) test for few minutes, even through the numa system has adequate
> memory on both nodes.
>
> this we have observed same on both arm64/thunderx numa and on x86 numa system!
>
> using latest ltp from master branch version 20160920-197-gbc4d3db
> and linux kernel version 4.9
>
> is this known bug already?
Probably not.
Is it possible that cpuset limits the process to one node, and numa
mempolicy to the other node?
> below is the oops log:
> [ 2280.275193] cgroup: new mount options do not match the existing
> superblock, will be ignored
> [ 2316.565940] cgroup: new mount options do not match the existing
> superblock, will be ignored
> [ 2393.388361] cpuset01: page allocation stalls for 10051ms, order:0,
> mode:0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO)
> [ 2393.388371] CPU: 9 PID: 18188 Comm: cpuset01 Not tainted 4.9.0 #1
> [ 2393.388373] Hardware name: Dell Inc. PowerEdge T630/0W9WXC, BIOS
> 1.0.4 08/29/2014
> [ 2393.388374] ffffc9000c1afba8 ffffffff813c771e ffffffff81a40be8
> 0000000000000001
> [ 2393.388377] ffffc9000c1afc30 ffffffff811b8c9a 024280ca00000202
> ffffffff81a40be8
> [ 2393.388380] ffffc9000c1afbd0 0000000000000010 ffffc9000c1afc40
> ffffc9000c1afbf0
> [ 2393.388383] Call Trace:
> [ 2393.388392] [<ffffffff813c771e>] dump_stack+0x63/0x85
> [ 2393.388397] [<ffffffff811b8c9a>] warn_alloc+0x13a/0x170
> [ 2393.388399] [<ffffffff811b95c4>] __alloc_pages_slowpath+0x884/0xac0
> [ 2393.388402] [<ffffffff811b9ac5>] __alloc_pages_nodemask+0x2c5/0x310
> [ 2393.388405] [<ffffffff8120f663>] alloc_pages_vma+0xb3/0x260
> [ 2393.388410] [<ffffffff811e0534>] ? anon_vma_interval_tree_insert+0x84/0x90
> [ 2393.388413] [<ffffffff811ea42c>] handle_mm_fault+0x129c/0x1550
> [ 2393.388417] [<ffffffff813d65bb>] ? call_rwsem_wake+0x1b/0x30
> [ 2393.388422] [<ffffffff8106a362>] __do_page_fault+0x222/0x4b0
> [ 2393.388424] [<ffffffff8106a61f>] do_page_fault+0x2f/0x80
> [ 2393.388429] [<ffffffff817ca588>] page_fault+0x28/0x30
> [ 2393.388431] Mem-Info:
> [ 2393.388437] active_anon:92316 inactive_anon:21059 isolated_anon:32
> active_file:202031 inactive_file:137088 isolated_file:0
> unevictable:16 dirty:20 writeback:5883 unstable:0
> slab_reclaimable:40274 slab_unreclaimable:21605
> mapped:26819 shmem:28393 pagetables:11375 bounce:0
> free:5494728 free_pcp:549 free_cma:0
> [ 2393.388446] Node 0 active_anon:310368kB inactive_anon:25684kB
> active_file:807836kB inactive_file:548592kB unevictable:60kB
> isolated(anon):0kB isolated(file):0kB mapped:101672kB dirty:80kB
> writeback:148kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB
> anon_thp: 25780kB writeback_tmp:0kB unstable:0kB pages_scanned:0
> all_unreclaimable? no
> [ 2393.388455] Node 1 active_anon:58896kB inactive_anon:58552kB
> active_file:288kB inactive_file:0kB unevictable:4kB
> isolated(anon):128kB isolated(file):0kB mapped:5604kB dirty:0kB
> writeback:23384kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB
> anon_thp: 87792kB writeback_tmp:0kB unstable:0kB pages_scanned:0
> all_unreclaimable? no
> [ 2393.388457] Node 1 Normal free:11937124kB min:45532kB low:62044kB
> high:78556kB active_anon:58896kB inactive_anon:58552kB
> active_file:288kB inactive_file:0kB unevictable:4kB
> writepending:23384kB present:16777216kB managed:16512808kB mlocked:4kB
> slab_reclaimable:37876kB slab_unreclaimable:44812kB
> kernel_stack:4264kB pagetables:27612kB bounce:0kB free_pcp:2240kB
> local_pcp:0kB free_cma:0kB
> [ 2393.388462] lowmem_reserve[]: 0 0 0 0
> [ 2393.388465] Node 1 Normal: 1179*4kB (UME) 1396*8kB (UME) 1193*16kB
> (UME) 910*32kB (UME) 721*64kB (UME) 568*128kB (UME) 444*256kB (UME)
> 328*512kB (ME) 223*1024kB (UM) 138*2048kB (ME) 2676*4096kB (M) =
> 11936412kB
> [ 2393.388479] Node 0 hugepages_total=4 hugepages_free=4
> hugepages_surp=0 hugepages_size=1048576kB
> [ 2393.388481] Node 1 hugepages_total=4 hugepages_free=4
> hugepages_surp=0 hugepages_size=1048576kB
> [ 2393.388481] 374277 total pagecache pages
> [ 2393.388483] 6667 pages in swap cache
> [ 2393.388484] Swap cache stats: add 101786, delete 95119, find 393/682
> [ 2393.388485] Free swap = 15979384kB
> [ 2393.388485] Total swap = 16383996kB
> [ 2393.388486] 8331071 pages RAM
> [ 2393.388486] 0 pages HighMem/MovableOnly
> [ 2393.388487] 152036 pages reserved
> [ 2393.388487] 0 pages hwpoisoned
> [ 2397.331098] cpuset01 invoked oom-killer:
> gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=1,
> order=0, oom_score_adj=0
>
>
> [gkulkarni@xeon-numa ltp]$ numactl --hardware
> available: 2 nodes (0-1)
> node 0 cpus: 0 2 4 6 8 10 12 14 16 18 20 22
> node 0 size: 15823 MB
> node 0 free: 10211 MB
> node 1 cpus: 1 3 5 7 9 11 13 15 17 19 21 23
> node 1 size: 16125 MB
> node 1 free: 11628 MB
> node distances:
> node 0 1
> 0: 10 21
> 1: 21 10
>
>
> thanks
> Ganapat
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@xxxxxxxxxx For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>
>