Re: [PATCH] mm: always consider THP when adjusting min_free_kbytes

From: Matthew Wilcox
Date: Tue Feb 04 2020 - 16:53:26 EST


On Tue, Feb 04, 2020 at 01:42:43PM -0800, Mike Kravetz wrote:
> On 2/4/20 12:33 PM, David Rientjes wrote:
> > On Tue, 4 Feb 2020, Mike Kravetz wrote:
> >
> > Hmm, if khugepaged_adjust_min_free_kbytes() increases min_free_kbytes for
> > thp, then the user has no ability to override this increase by using
> > vm.min_free_kbytes?
> >
> > IIUC, with this change, it looks like memory hotplug events properly
> > increase min_free_kbytes for thp optimization but also doesn't respect a
> > previous user-defined value?
>
> Good catch.
>
> We should only call khugepaged_adjust_min_free_kbytes from the 'true'
> block of this if statement in init_per_zone_wmark_min.
>
> if (new_min_free_kbytes > user_min_free_kbytes) {
> min_free_kbytes = new_min_free_kbytes;
> if (min_free_kbytes < 128)
> min_free_kbytes = 128;
> if (min_free_kbytes > 65536)
> min_free_kbytes = 65536;
> } else {
> pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
> new_min_free_kbytes, user_min_free_kbytes);
> }
>
> In the existing code, a hotplug event will cause min_free_kbytes to overwrite
> the user defined value if the new value is greater. However, you will get
> the warning message if the user defined value is greater. I am not sure if
> this is the 'desired/expected' behavior? We print a warning if the user value
> takes precedence over our calculated value. However, we do not print a message
> if we overwrite the user defined value. That doesn't seem right!
>
> > So it looks like this is fixing an obvious correctness issue but also now
> > requires users to rewrite the sysctl if they want to decrease the min
> > watermark.
>
> Moving the call to khugepaged_adjust_min_free_kbytes as described above
> would avoid the THP adjustment unless we were going to overwrite the
> user defined value. Now, I am not sure overwriting the user defined value
> as is done today is actually the correct thing to do.
>
> Thoughts?
> Perhaps we should never overwrite a user defined value?

We should certainly warn if we would have adjusted it, had they not
changed it!

I'm reluctant to suggest we do a more complex adjustment of the value
(eg figure out what the adjustment would have been, then apply some
fraction of that adjustment to keep the ratios in proportion) because
we don't really know why they adjusted it.

OTOH, we should adjust it if the user-set min_free_kbytes is now too
large for the amount of memory now in the machine.