Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

From: Mike Kravetz
Date: Thu Oct 13 2016 - 19:27:17 EST


On 10/13/2016 08:24 AM, Mike Kravetz wrote:
> On 10/13/2016 05:19 AM, Jan Stancek wrote:
>> Hi,
>>
>> I'm running into ENOMEM failures with libhugetlbfs testsuite [1] on
>> a power8 lpar system running 4.8 or latest git [2]. Repeated runs of
>> this suite trigger multiple OOMs, that eventually kill entire system,
>> it usually takes 3-5 runs:
>>
>> * Total System Memory......: 18024 MB
>> * Shared Mem Max Mapping...: 320 MB
>> * System Huge Page Size....: 16 MB
>> * Available Huge Pages.....: 20
>> * Total size of Huge Pages.: 320 MB
>> * Remaining System Memory..: 17704 MB
>> * Huge Page User Group.....: hugepages (1001)
>>

Hi Jan,

Any chance you can get the contents of /sys/kernel/mm/hugepages
before and after the first run of the libhugetlbfs testsuite on Power?
Perhaps with a script like:

cd /sys/kernel/mm/hugepages
for f in hugepages-*/*; do
	n=$(cat "$f")
	printf '%s\t%s\n' "$n" "$f"
done

Just want to make sure the numbers look as they should.
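
If it helps with the before/after comparison, here is a rough sketch of
the same loop wrapped in a function, so the two snapshots can be saved
and diffed. The function name snapshot_hugepages, the optional
directory argument, and the before.txt/after.txt file names are only
placeholders for illustration and not part of any existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print "value<TAB>path" for every hugepage
# counter file found under a base directory.
# The default is the real sysfs location; the optional argument is an
# assumption added here so the function can be dry-run elsewhere.
snapshot_hugepages() {
    base=${1:-/sys/kernel/mm/hugepages}
    (
        # Run in a subshell so the caller's working directory is untouched.
        cd "$base" || exit 1
        for f in hugepages-*/*; do
            # Skip an unmatched glob pattern or anything that is not a regular file.
            [ -f "$f" ] || continue
            printf '%s\t%s\n' "$(cat "$f")" "$f"
        done
    )
}

# Possible usage (file names are placeholders):
#   snapshot_hugepages > before.txt
#   ... run the libhugetlbfs testsuite once ...
#   snapshot_hugepages > after.txt
#   diff -u before.txt after.txt
```

Any counter that differs between the two files after a run should
stand out right away in the diff output.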

--
Mike Kravetz