[PATCH] mm/page_alloc: increase default min_free_kbytes bound

From: Joel Savitz
Date: Thu Feb 20 2020 - 10:01:37 EST



Currently, the vm.min_free_kbytes sysctl value is capped at a hardcoded
64M in init_per_zone_wmark_min (unless it is overridden by khugepaged
initialization).

This value has not been modified since 2005, and enterprise-grade
systems now frequently have hundreds of GB of RAM and multiple 10, 40,
or even 100 GB NICs. We have seen page allocation failures on heavily
loaded systems related to NIC drivers. These issues were resolved by an
increase to vm.min_free_kbytes.

This patch increases the hardcoded value by a factor of 4 as a temporary
solution.

Further work to make the calculation of vm.min_free_kbytes more
consistent throughout the kernel would be desirable.

As an example of the inconsistency of the current method, this value is
recalculated by init_per_zone_wmark_min() in the case of memory hotplug
which will override the value set by set_recommended_min_free_kbytes()
called during khugepaged initialization even if khugepaged remains
enabled, however an on/off toggle of khugepaged will then recalculate
and set the value via set_recommended_min_free_kbytes().

Signed-off-by: Joel Savitz <jsavitz@xxxxxxxxxx>
---
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3c4eb750a199..32cbfb13e958 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7867,8 +7867,8 @@ int __meminit init_per_zone_wmark_min(void)
min_free_kbytes = new_min_free_kbytes;
if (min_free_kbytes < 128)
min_free_kbytes = 128;
- if (min_free_kbytes > 65536)
- min_free_kbytes = 65536;
+ if (min_free_kbytes > 262144)
+ min_free_kbytes = 262144;
} else {
pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
new_min_free_kbytes, user_min_free_kbytes);
--
2.23.0