[PATCH] mm, hugetlb: Avoid passing a null nodemask when there is mbind policy

From: Oscar Salvador
Date: Tue Apr 15 2025 - 08:24:26 EST


Before trying to allocate a page, gather_surplus_pages() sets up a nodemask
of the nodes it can allocate from. Instead of passing that nodemask down to
the page allocator, it iterates over the nodes in the mask itself, so the
page allocator receives only a preferred_nid and a NULL nodemask.

This is a problem when a memory policy is in use, because the page
allocator may fall back to a node that is not part of the policy.

Avoid that by passing the nodemask directly to the page allocator, so it can
filter out fallback nodes that are not part of the nodemask.

Signed-off-by: Oscar Salvador <osalvador@xxxxxxx>
---
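Note for reviewers: the fallback problem described above can be pictured
with a small userspace model. This is only a sketch under stated
assumptions, not kernel code: model_pick_node(), model_nodemask_t and the
MODEL_* constants are hypothetical names, and the round-robin walk merely
stands in for the real zonelist ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_NODES 8
#define MODEL_NO_NODE   (-1)

/* Hypothetical model type: bit n set means node n is allowed. */
typedef unsigned int model_nodemask_t;

/*
 * Simplified stand-in for the allocator's node fallback (assumption:
 * nodes are tried in round-robin order starting at 'preferred'; the
 * real allocator walks a distance-ordered zonelist).
 *
 * Returns the first node that has free pages and, when 'allowed' is
 * non-NULL, is also set in the mask. Returns MODEL_NO_NODE otherwise.
 */
static int model_pick_node(int preferred, const model_nodemask_t *allowed,
			   const bool has_free[MODEL_MAX_NODES])
{
	assert(preferred >= 0 && preferred < MODEL_MAX_NODES);

	for (int i = 0; i < MODEL_MAX_NODES; i++) {
		int nid = (preferred + i) % MODEL_MAX_NODES;

		if (!has_free[nid])
			continue;
		/* With a NULL mask nothing filters the fallback nodes. */
		if (allowed && !(*allowed & (1u << nid)))
			continue;
		return nid;
	}
	return MODEL_NO_NODE;
}
```

In this model, with a policy covering nodes 0 and 1 and free pages only on
node 2, a NULL mask lets the walk fall back to node 2, while passing the
mask makes the request fail instead of leaving the policy.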
mm/hugetlb.c | 22 ++++++----------------
1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ccc4f08f8481..5e1cba0f835f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2419,7 +2419,6 @@ static int gather_surplus_pages(struct hstate *h, long delta)
long i;
long needed, allocated;
bool alloc_ok = true;
- int node;
nodemask_t *mbind_nodemask, alloc_nodemask;

mbind_nodemask = policy_mbind_nodemask(htlb_alloc_mask(h));
@@ -2443,21 +2442,12 @@ static int gather_surplus_pages(struct hstate *h, long delta)
for (i = 0; i < needed; i++) {
folio = NULL;

- /* Prioritize current node */
- if (node_isset(numa_mem_id(), alloc_nodemask))
- folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h),
- numa_mem_id(), NULL);
-
- if (!folio) {
- for_each_node_mask(node, alloc_nodemask) {
- if (node == numa_mem_id())
- continue;
- folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h),
- node, NULL);
- if (folio)
- break;
- }
- }
+ /*
+ * It is okay to use NUMA_NO_NODE because we use numa_mem_id()
+ * down the road to pick the current node if that is the case.
+ */
+ folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h),
+ NUMA_NO_NODE, &alloc_nodemask);
if (!folio) {
alloc_ok = false;
break;
--
2.49.0