Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system
From: Aneesh Kumar K.V
Date: Mon Oct 17 2016 - 04:42:38 EST
Mike Kravetz <mike.kravetz@xxxxxxxxxx> writes:
> On 10/14/2016 01:48 AM, Jan Stancek wrote:
>> On 10/14/2016 01:26 AM, Mike Kravetz wrote:
>>>
>>> Hi Jan,
>>>
>>> Any chance you can get the contents of /sys/kernel/mm/hugepages
>>> before and after the first run of libhugetlbfs testsuite on Power?
>>> Perhaps a script like:
>>>
>>> cd /sys/kernel/mm/hugepages
>>> for f in hugepages-*/*; do
>>>     n=`cat $f`;
>>>     echo -e "$n\t$f";
>>> done
>>>
>>> Just want to make sure the numbers look as they should.
>>>
>>
>> Hi Mike,
>>
>> Numbers are below. I have also isolated a single testcase from the "func"
>> group of tests: corrupt-by-cow-opt [1]. This test stops working if I
>> run it 19 times (with 20 hugepages). And if I disable this test, the
>> "func" group tests all pass repeatedly.
>
> Thanks Jan,
>
> I appreciate your efforts.
>
>>
>> [1] https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/corrupt-by-cow-opt.c
>>
>> Regards,
>> Jan
>>
>> Kernel is v4.8-14230-gb67be92, with reboot between each run.
>> 1) Only func tests
>> System boot
>> After setup:
>> 20 hugepages-16384kB/free_hugepages
>> 20 hugepages-16384kB/nr_hugepages
>> 20 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 0 hugepages-16384kB/resv_hugepages
>> 0 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>>
>> After func tests:
>> ********** TEST SUMMARY
>> * 16M
>> * 32-bit 64-bit
>> * Total testcases: 0 85
>> * Skipped: 0 0
>> * PASS: 0 81
>> * FAIL: 0 4
>> * Killed by signal: 0 0
>> * Bad configuration: 0 0
>> * Expected FAIL: 0 0
>> * Unexpected PASS: 0 0
>> * Strange test result: 0 0
>>
>> 26 hugepages-16384kB/free_hugepages
>> 26 hugepages-16384kB/nr_hugepages
>> 26 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 1 hugepages-16384kB/resv_hugepages
>> 0 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>>
>> After test cleanup:
>> umount -a -t hugetlbfs
>> hugeadm --pool-pages-max ${HPSIZE}:0
>>
>> 1 hugepages-16384kB/free_hugepages
>> 1 hugepages-16384kB/nr_hugepages
>> 1 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 1 hugepages-16384kB/resv_hugepages
>> 1 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>>
>
> I am guessing the leaked reserve page is triggered by running the
> test you isolated, corrupt-by-cow-opt.
>
>
>> ---
>>
>> 2) Only stress tests
>> System boot
>> After setup:
>> 20 hugepages-16384kB/free_hugepages
>> 20 hugepages-16384kB/nr_hugepages
>> 20 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 0 hugepages-16384kB/resv_hugepages
>> 0 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>>
>> After stress tests:
>> 20 hugepages-16384kB/free_hugepages
>> 20 hugepages-16384kB/nr_hugepages
>> 20 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 17 hugepages-16384kB/resv_hugepages
>> 0 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>>
>> After cleanup:
>> 17 hugepages-16384kB/free_hugepages
>> 17 hugepages-16384kB/nr_hugepages
>> 17 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 17 hugepages-16384kB/resv_hugepages
>> 17 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>>
>
> This looks worse than the summary after running the functional tests.
>
>> ---
>>
>> 3) only corrupt-by-cow-opt
>>
>> System boot
>> After setup:
>> 20 hugepages-16384kB/free_hugepages
>> 20 hugepages-16384kB/nr_hugepages
>> 20 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 0 hugepages-16384kB/resv_hugepages
>> 0 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>>
>> libhugetlbfs-2.18# env LD_LIBRARY_PATH=./obj64 ./tests/obj64/corrupt-by-cow-opt; /root/grab.sh
>> Starting testcase "./tests/obj64/corrupt-by-cow-opt", pid 3298
>> Write s to 0x3effff000000 via shared mapping
>> Write p to 0x3effff000000 via private mapping
>> Read s from 0x3effff000000 via shared mapping
>> PASS
>> 20 hugepages-16384kB/free_hugepages
>> 20 hugepages-16384kB/nr_hugepages
>> 20 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 1 hugepages-16384kB/resv_hugepages
>> 0 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>
> Leaked one reserve page
>
>>
>> # env LD_LIBRARY_PATH=./obj64 ./tests/obj64/corrupt-by-cow-opt; /root/grab.sh
>> Starting testcase "./tests/obj64/corrupt-by-cow-opt", pid 3312
>> Write s to 0x3effff000000 via shared mapping
>> Write p to 0x3effff000000 via private mapping
>> Read s from 0x3effff000000 via shared mapping
>> PASS
>> 20 hugepages-16384kB/free_hugepages
>> 20 hugepages-16384kB/nr_hugepages
>> 20 hugepages-16384kB/nr_hugepages_mempolicy
>> 0 hugepages-16384kB/nr_overcommit_hugepages
>> 2 hugepages-16384kB/resv_hugepages
>> 0 hugepages-16384kB/surplus_hugepages
>> 0 hugepages-16777216kB/free_hugepages
>> 0 hugepages-16777216kB/nr_hugepages
>> 0 hugepages-16777216kB/nr_hugepages_mempolicy
>> 0 hugepages-16777216kB/nr_overcommit_hugepages
>> 0 hugepages-16777216kB/resv_hugepages
>> 0 hugepages-16777216kB/surplus_hugepages
>
> It is pretty consistent that we leak a reserve page every time this
> test is run.
>
> The interesting thing is that corrupt-by-cow-opt is a very simple
> test case. Commit 67961f9db8c4 potentially changes the return values
> of the functions vma_has_reserves() and vma_needs/commit_reservation()
> for the owner (HPAGE_RESV_OWNER) of private mappings. Running the
> test with and without the commit results in the same return values for
> these routines on x86, and no leaked reserve pages.
Looking at that commit, I am not sure the region_chg output indicates a
hole punched. I.e., w.r.t. a private mapping, when we mmap we don't do a
region_chg (hugetlb_reserve_pages()). So on a later fault, when we call
vma_needs_reservation, we will find region_chg returning >= 0, right?
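
To make the question concrete, here is a minimal user-space sketch of the
ambiguity I mean. This is not the mm/hugetlb.c code; resv_map[],
toy_region_chg() and the fault/hole-punch sequence are made-up stand-ins,
and the point only holds if my reading of the private-mapping path above
is right:

#include <stdbool.h>
#include <stdio.h>

#define NPAGES 4

/* Toy per-VMA reserve map: resv_map[i] is true once page i has an entry. */
static bool resv_map[NPAGES];

/* Stand-in for region_chg(): how many entries faulting page i would add. */
static int toy_region_chg(int page)
{
	return resv_map[page] ? 0 : 1;
}

int main(void)
{
	/* Private mapping: nothing is added to the map at mmap time,
	 * so the very first fault already sees region_chg >= 0. */
	printf("first fault:      region_chg = %d\n", toy_region_chg(0));
	resv_map[0] = true;	/* fault path commits the reservation */

	/* A hole punch removes the entry again ... */
	resv_map[0] = false;

	/* ... and the next fault gets exactly the same answer, so the
	 * return value by itself cannot distinguish "never faulted"
	 * from "entry removed by a hole punch". */
	printf("after hole punch: region_chg = %d\n", toy_region_chg(0));
	return 0;
}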
>
> Is it possible to revert this commit and run the libhugetlbfs tests
> (func and stress) again while monitoring the counts in /sys? The
> counts should go to zero after cleanup as you describe above. I just
> want to make sure that this commit is causing all the problems you
> are seeing. If it is, then we can consider reverting and I can try
> to think of another way to address the original issue.
>
> Thanks for your efforts on this. I cannot reproduce on x86 or sparc
> and do not see any similar symptoms on these architectures.
>
I am not sure how any of this is arch specific. So on both x86 and sparc,
we don't see the counts going wrong as above?
-aneesh