Re: [PATCH] mm: always consider THP when adjusting min_free_kbytes
From: Mike Kravetz
Date: Tue Feb 04 2020 - 19:35:14 EST
On 2/4/20 1:53 PM, Matthew Wilcox wrote:
> On Tue, Feb 04, 2020 at 01:42:43PM -0800, Mike Kravetz wrote:
>> On 2/4/20 12:33 PM, David Rientjes wrote:
>>> On Tue, 4 Feb 2020, Mike Kravetz wrote:
>>>
>>> Hmm, if khugepaged_adjust_min_free_kbytes() increases min_free_kbytes for
>>> thp, then the user has no ability to override this increase by using
>>> vm.min_free_kbytes?
>>>
>>> IIUC, with this change, it looks like memory hotplug events properly
>>> increase min_free_kbytes for thp optimization but also doesn't respect a
>>> previous user-defined value?
>>
>> Good catch.
>>
>> We should only call khugepaged_adjust_min_free_kbytes from the 'true'
>> block of this if statement in init_per_zone_wmark_min.
>>
>> if (new_min_free_kbytes > user_min_free_kbytes) {
>> min_free_kbytes = new_min_free_kbytes;
>> if (min_free_kbytes < 128)
>> min_free_kbytes = 128;
>> if (min_free_kbytes > 65536)
>> min_free_kbytes = 65536;
>> } else {
>> pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
>> new_min_free_kbytes, user_min_free_kbytes);
>> }
>>
>> In the existing code, a hotplug event will cause min_free_kbytes to overwrite
>> the user defined value if the new value is greater. However, you will get
>> the warning message if the user defined value is greater. I am not sure if
>> this is the 'desired/expected' behavior? We print a warning if the user value
>> takes precedence over our calculated value. However, we do not print a message
>> if we overwrite the user defined value. That doesn't seem right!
>>
>>> So it looks like this is fixing an obvious correctness issue but also now
>>> requires users to rewrite the sysctl if they want to decrease the min
>>> watermark.
>>
>> Moving the call to khugepaged_adjust_min_free_kbytes as described above
>> would avoid the THP adjustment unless we were going to overwrite the
>> user defined value. Now, I am not sure overwriting the user defined value
>> as is done today is actually the correct thing to do.
>>
>> Thoughts?
>> Perhaps we should never overwrite a user defined value?
>
> We should certainly warn if we would have adjusted it, had they not
> changed it!
Ok, the code above does that today.
> I'm reluctant to suggest we do a more complex adjustment of the value
> (eg figure out what the adjustment would have been, then apply some
> fraction of that adjustment to keep the ratios in proportion) because
> we don't really know why they adjusted it.
Agree!
> OTOH, we should adjust it if the user-set min_free_kbytes is now too
> large for the amount of memory now in the machine.
Today, we never overwrite a user defined value that is larger than
that calculated by the code. However, we will owerwrite a user defined
value if the code calculates a larger value.
I'm starting to think the best option is to NEVER overwrite a user defined
value.
--
Mike Kravetz