Re: [PATCH] mm/hugetlb.c: fix reservation race when freeing surplus pages

From: Michal Hocko
Date: Tue Jan 10 2017 - 04:34:42 EST


On Mon 09-01-17 11:56:07, Mike Kravetz wrote:
> The routine return_unused_surplus_pages decrements the global
> reservation count, and frees any unused surplus pages that were
> backing the reservation. Commit 7848a4bf51b3 ("mm/hugetlb.c:
> add cond_resched_lock() in return_unused_surplus_pages()") added
> a call to cond_resched_lock in the loop freeing the pages. As
> a result, the hugetlb_lock could be dropped, and someone else
> could use the pages that will be freed in subsequent iterations
> of the loop. This could result in inconsistent global hugetlb
> page state, application api failures (such as mmap) failures or
> application crashes.
>
> When dropping the lock in return_unused_surplus_pages, make sure
> that the global reservation count (resv_huge_pages) remains
> sufficiently large to prevent someone else from claiming pages
> about to be freed.
>
> Fixes: 7848a4bf51b3 ("mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()")
> Reported-and-analyzed-by: Paul Cassella <cassella@xxxxxxxx>
> Suggested-by: Michal Hocko <mhocko@xxxxxxxxxx>
> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

Looks good to me. I think we want also
Cc: stable # 3.15+

Paul, your Tested-by would be more than appreciated.

Thanks Mike!

> ---
> mm/hugetlb.c | 37 ++++++++++++++++++++++++++++---------
> 1 file changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 418bf01..a1760fa 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1773,23 +1773,32 @@ static int gather_surplus_pages(struct hstate *h, int delta)
> }
>
> /*
> - * When releasing a hugetlb pool reservation, any surplus pages that were
> - * allocated to satisfy the reservation must be explicitly freed if they were
> - * never used.
> - * Called with hugetlb_lock held.
> + * This routine has two main purposes:
> + * 1) Decrement the reservation count (resv_huge_pages) by the value passed
> + * in unused_resv_pages. This corresponds to the prior adjustments made
> + * to the associated reservation map.
> + * 2) Free any unused surplus pages that may have been allocated to satisfy
> + * the reservation. As many as unused_resv_pages may be freed.
> + *
> + * Called with hugetlb_lock held. However, the lock could be dropped (and
> + * reacquired) during calls to cond_resched_lock. Whenever dropping the lock,
> + * we must make sure nobody else can claim pages we are in the process of
> + * freeing. Do this by ensuring resv_huge_page always is greater than the
> + * number of huge pages we plan to free when dropping the lock.
> */
> static void return_unused_surplus_pages(struct hstate *h,
> unsigned long unused_resv_pages)
> {
> unsigned long nr_pages;
>
> - /* Uncommit the reservation */
> - h->resv_huge_pages -= unused_resv_pages;
> -
> /* Cannot return gigantic pages currently */
> if (hstate_is_gigantic(h))
> - return;
> + goto out;
>
> + /*
> + * Part (or even all) of the reservation could have been backed
> + * by pre-allocated pages. Only free surplus pages.
> + */
> nr_pages = min(unused_resv_pages, h->surplus_huge_pages);
>
> /*
> @@ -1799,12 +1808,22 @@ static void return_unused_surplus_pages(struct hstate *h,
> * when the nodes with surplus pages have no free pages.
> * free_pool_huge_page() will balance the the freed pages across the
> * on-line nodes with memory and will handle the hstate accounting.
> + *
> + * Note that we decrement resv_huge_pages as we free the pages. If
> + * we drop the lock, resv_huge_pages will still be sufficiently large
> + * to cover subsequent pages we may free.
> */
> while (nr_pages--) {
> + h->resv_huge_pages--;
> + unused_resv_pages--;
> if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
> - break;
> + goto out;
> cond_resched_lock(&hugetlb_lock);
> }
> +
> +out:
> + /* Fully uncommit the reservation */
> + h->resv_huge_pages -= unused_resv_pages;
> }
>
>
> --
> 2.7.4

--
Michal Hocko
SUSE Labs