Re: [PATCH] mm: always consider THP when adjusting min_free_kbytes

From: Mike Kravetz
Date: Tue Feb 04 2020 - 16:44:00 EST


On 2/4/20 12:33 PM, David Rientjes wrote:
> On Tue, 4 Feb 2020, Mike Kravetz wrote:
>
> Hmm, if khugepaged_adjust_min_free_kbytes() increases min_free_kbytes for
> thp, then the user has no ability to override this increase by using
> vm.min_free_kbytes?
>
> IIUC, with this change, it looks like memory hotplug events properly
> increase min_free_kbytes for thp optimization but also doesn't respect a
> previous user-defined value?

Good catch.

We should only call khugepaged_adjust_min_free_kbytes from the 'true'
block of this if statement in init_per_zone_wmark_min.

if (new_min_free_kbytes > user_min_free_kbytes) {
min_free_kbytes = new_min_free_kbytes;
if (min_free_kbytes < 128)
min_free_kbytes = 128;
if (min_free_kbytes > 65536)
min_free_kbytes = 65536;
} else {
pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
new_min_free_kbytes, user_min_free_kbytes);
}

In the existing code, a hotplug event will cause min_free_kbytes to overwrite
the user defined value if the new value is greater. However, you will get
the warning message if the user defined value is greater. I am not sure if
this is the 'desired/expected' behavior? We print a warning if the user value
takes precedence over our calculated value. However, we do not print a message
if we overwrite the user defined value. That doesn't seem right!

> So it looks like this is fixing an obvious correctness issue but also now
> requires users to rewrite the sysctl if they want to decrease the min
> watermark.

Moving the call to khugepaged_adjust_min_free_kbytes as described above
would avoid the THP adjustment unless we were going to overwrite the
user defined value. Now, I am not sure overwriting the user defined value
as is done today is actually the correct thing to do.

Thoughts?
Perhaps we should never overwrite a user defined value?
--
Mike Kravetz