[PATCH] hugetlb: optimize race error return in alloc_surplus_huge_page

From: Mike Kravetz
Date: Tue Aug 11 2020 - 15:45:41 EST


The routine alloc_surplus_huge_page() could race with with a pool
size change. If this happens, the allocated page may not be needed.
To free the page, the current code will 'Abuse temporary page to
workaround the nasty free_huge_page codeflow'. Instead, directly
call the low level routine that free_huge_page uses. This works
out well because the page is new, we hold the only reference and
already hold the hugetlb_lock.

Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
---
mm/hugetlb.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 590111ea6975..ac89b91fba86 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1923,14 +1923,17 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
/*
* We could have raced with the pool size change.
* Double check that and simply deallocate the new page
- * if we would end up overcommiting the surpluses. Abuse
- * temporary page to workaround the nasty free_huge_page
- * codeflow
+ * if we would end up overcommiting the surpluses.
*/
if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
- SetPageHugeTemporary(page);
+ /*
+ * Since this page is new, we hold the only reference, and
+ * we already hold the hugetlb_lock call the low level free
+ * page routine. This saves at least a lock roundtrip.
+ */
+ (void)put_page_testzero(page); /* don't call destructor */
+ update_and_free_page(h, page);
spin_unlock(&hugetlb_lock);
- put_page(page);
return NULL;
} else {
h->surplus_huge_pages++;
--
2.25.4


Here is a description of the difference in "Temporary Page" vs "Surplus
Page" approach.

Both only allocate a fresh huge page if surplus_huge_pages is less than
nr_overcommit_huge_pages. Of course, the lock protecting those counts
must be dropped to perform the allocation. After reacquiring the lock
is where we have the proposed difference in behavior.

temporary page behavior
-----------------------
if surplus_huge_pages >= h->nr_overcommit_huge_pages
SetPageHugeTemporary(page)
spin_unlock(&hugetlb_lock);
put_page(page);

At this time we know surplus_huge_pages is 'at least' nr_overcommit_huge_pages.
As a result, any new allocation will fail.
Only user visible result is that number of huge pages will be one greater than
that specified by user and overcommit values. This is only visible for the
short time until the page is actully freed as a result of put_page().

free_huge_page()
number of huge pages will be decremented

suprlus page behavior
---------------------
surplus_huge_pages++
surplus_huge_pages_node[page_to_nid(page)]++
if surplus_huge_pages > nr_overcommit_huge_pages
spin_unlock(&hugetlb_lock);
put_page(page);

At this time we know surplus_huge_pages is greater than
nr_overcommit_huge_pages. As a result, any new allocation will fail.
User visible result is an increase in surplus pages as well as number of
huge pages. In addition, surplus pages will exceed overcommit. This is
only visible for the short time until the page is actully freed as a
result of put_page().

free_huge_page()
number of huge pages will be decremented
h->surplus_huge_pages--;
h->surplus_huge_pages_node[nid]--;