Re: getting oom/stalls for ltp test cpuset01 with latest/4.9 kernel

From: Vlastimil Babka
Date: Thu Jan 12 2017 - 06:11:46 EST


On 01/11/2017 05:46 PM, Michal Hocko wrote:
> On Wed 11-01-17 21:52:29, Ganapatrao Kulkarni wrote:
>
>> [ 2398.169391] Node 1 Normal: 951*4kB (UME) 1308*8kB (UME) 1034*16kB (UME) 742*32kB (UME) 581*64kB (UME) 450*128kB (UME) 362*256kB (UME) 275*512kB (ME) 189*1024kB (UM) 117*2048kB (ME) 2742*4096kB (M) = 12047196kB
>
> Most of the memblocks are marked Unmovable (except for the 4MB blocks)

No, UME here means that e.g. 4kB blocks are available on the unmovable, movable and reclaimable free lists.

> which shouldn't matter because we can fall back to unmovable blocks for
> movable allocation AFAIR so we shouldn't really fail the request. I
> really fail to see what is going on there but it smells really
> suspicious.
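
To make the reading of the letters concrete, here is a minimal Python sketch that decodes the quoted "Node 1 Normal" line and re-checks its total. The letter table follows the one used by the kernel's show_free_areas() output (U = unmovable, M = movable, E = reclaimable, H = highatomic, C = CMA, I = isolate); the function name parse_free_area_line and the regular expression are illustrative assumptions, not anything from the kernel or LTP.

```python
import re

# Letters printed per migratetype in the kernel's free-area dump.
MIGRATETYPE_LETTERS = {
    "U": "unmovable",
    "M": "movable",
    "E": "reclaimable",
    "H": "highatomic",
    "C": "CMA",
    "I": "isolate",
}

# One "<count>*<size>kB (<letters>)" entry; the letters are treated as
# optional because zero-count entries may be printed without them.
ENTRY_RE = re.compile(r"(\d+)\*(\d+)kB(?: \(([A-Z]*)\))?")


def parse_free_area_line(line):
    """Hypothetical helper: return (entries, total_kb) for one zone line."""
    entries = []
    for count, size_kb, letters in ENTRY_RE.findall(line):
        lists = [MIGRATETYPE_LETTERS.get(ch, "unknown:" + ch) for ch in letters]
        entries.append((int(count), int(size_kb), lists))
    total_kb = sum(count * size_kb for count, size_kb, _ in entries)
    return entries, total_kb


line = (
    "Node 1 Normal: 951*4kB (UME) 1308*8kB (UME) 1034*16kB (UME) "
    "742*32kB (UME) 581*64kB (UME) 450*128kB (UME) 362*256kB (UME) "
    "275*512kB (ME) 189*1024kB (UM) 117*2048kB (ME) 2742*4096kB (M) "
    "= 12047196kB"
)
entries, total_kb = parse_free_area_line(line)
for count, size_kb, lists in entries:
    print(f"{count:5d} x {size_kb:4d}kB free on: {', '.join(lists)}")
print(f"total free: {total_kb}kB")  # total free: 12047196kB
```

Read this way, every order from 4kB up to 256kB has free blocks on all three of the unmovable, movable and reclaimable lists, and only the 4096kB blocks are movable-only.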

Perhaps there's something wrong with zonelists and we are skipping the Node 1 Normal zone. Or there's some race with cpuset operations (but I can't see how).
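
One cheap thing to check from userspace while the test runs is which memory nodes the failing task is currently allowed to use. The sketch below reads the Mems_allowed_list field from /proc/<pid>/status; the helper names parse_node_list and mems_allowed are made up for illustration. It only shows the cpuset view of the task, it does not inspect the zonelists themselves, so it can narrow things down but not prove either theory.

```python
def parse_node_list(text):
    """Expand a kernel list string such as "0-1,3" into a set of ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single number like "3" has no upper bound, so reuse lo.
        nodes.update(range(int(lo), int(hi or lo) + 1))
    return nodes


def mems_allowed(pid="self"):
    """Return the allowed memory nodes of a task (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("Mems_allowed_list:"):
                return parse_node_list(line.split(":", 1)[1])
    return set()


if __name__ == "__main__":
    allowed = mems_allowed()
    print("allowed memory nodes:", sorted(allowed))
    print("node 1 allowed:", 1 in allowed)
```

If node 1 drops out of the set at the moment the allocation fails, that would point at the cpuset side rather than at the free lists.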

The question is, how reproducible is this? And what exactly does the test cpuset01 do? Is it doing multiple things in a loop that could be reduced to a single testcase?