Re: question: should_compact_retry limit

From: Mike Kravetz
Date: Wed Jun 05 2019 - 12:09:37 EST


On 6/5/19 12:58 AM, Vlastimil Babka wrote:
> On 6/5/19 1:30 AM, Mike Kravetz wrote:
>> While looking at some really long hugetlb page allocation times, I noticed
>> instances where should_compact_retry() was returning true more often than
>> I expected. In one allocation attempt, it returned true 765668 times in a
>> row. To me, this was unexpected because of the following:
>>
>> #define MAX_COMPACT_RETRIES 16
>> int max_retries = MAX_COMPACT_RETRIES;
>>
>> However, if should_compact_retry() returns true via the following path we
>> do not increase the retry count.
>>
>> 	/*
>> 	 * make sure the compaction wasn't deferred or didn't bail out early
>> 	 * due to locks contention before we declare that we should give up.
>> 	 * But do not retry if the given zonelist is not suitable for
>> 	 * compaction.
>> 	 */
>> 	if (compaction_withdrawn(compact_result)) {
>> 		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
>> 		goto out;
>> 	}
>>
>> Just curious, is this intentional?
>
> Hmm, I guess we didn't expect compaction_withdrawn() to return true so
> consistently. Do you know what value compact_result has in your test?

I added some instrumentation to record the compact_result values and ran the
test:

557904 Total

549186 COMPACT_DEFERRED
8718 COMPACT_PARTIAL_SKIPPED
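
To illustrate the path in question, here is a minimal userspace sketch (not
kernel code) of a retry decision with an early exit that never touches the
retry counter. All sketch_* names are hypothetical, the real
compaction_withdrawn() and should_compact_retry() handle more cases, and the
counter bookkeeping on the other path is simplified on purpose.

```c
#include <stdbool.h>

#define MAX_COMPACT_RETRIES 16

/* Simplified stand-ins for the compact_result values seen in the trace. */
enum sketch_result {
	SKETCH_DEFERRED,        /* models COMPACT_DEFERRED */
	SKETCH_PARTIAL_SKIPPED, /* models COMPACT_PARTIAL_SKIPPED */
	SKETCH_OTHER,           /* any result that is not "withdrawn" */
};

/* Assumption: only the two results recorded above count as withdrawn. */
static bool sketch_withdrawn(enum sketch_result r)
{
	return r == SKETCH_DEFERRED || r == SKETCH_PARTIAL_SKIPPED;
}

/*
 * Hypothetical model of the retry decision. zonelist_suitable stands in
 * for the return value of compaction_zonelist_suitable().
 */
static bool sketch_should_retry(enum sketch_result r, bool zonelist_suitable,
				int *retries)
{
	/* Early exit: the answer does not depend on *retries at all. */
	if (sketch_withdrawn(r))
		return zonelist_suitable;

	/* Simplification: every other result consumes one retry. */
	if (++(*retries) <= MAX_COMPACT_RETRIES)
		return true;

	return false;
}

/*
 * Count consecutive "retry" answers when every attempt yields result r.
 * The loop is capped at limit so the withdrawn case terminates.
 */
static long sketch_count_retries(enum sketch_result r, long limit)
{
	int retries = 0;
	long n = 0;

	while (n < limit && sketch_should_retry(r, true, &retries))
		n++;

	return n;
}
```

Under these assumptions, sketch_count_retries(SKETCH_OTHER, 1000000) stops
after MAX_COMPACT_RETRIES iterations, while the SKETCH_DEFERRED case keeps
answering "retry" until the caller's limit is reached.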

Do note that this is not my biggest problem with these allocations. That is
should_continue_reclaim() returning true more often than it should. I am
still trying to get more info on that. This was just something curious I also
discovered.
--
Mike Kravetz