Re: [PATCH 3/8] hugetlb: add per-hstate mutex to synchronize user adjustments

From: Oscar Salvador
Date: Thu Mar 25 2021 - 08:30:42 EST


On Wed, Mar 24, 2021 at 05:28:30PM -0700, Mike Kravetz wrote:
> The helper routine hstate_next_node_to_alloc accesses and modifies the
> hstate variable next_nid_to_alloc. The helper is used by the routines
> alloc_pool_huge_page and adjust_pool_surplus. adjust_pool_surplus is
> called with hugetlb_lock held. However, alloc_pool_huge_page can not
> be called with the hugetlb lock held as it will call the page allocator.
> Two instances of alloc_pool_huge_page could be run in parallel or
> alloc_pool_huge_page could run in parallel with adjust_pool_surplus
> which may result in the variable next_nid_to_alloc becoming invalid
> for the caller and pages being allocated on the wrong node.

Is this something you have seen happening? If so, is it easy to
trigger? I doubt it, as I have not seen any bug reports, but I am
wondering whether a Fixes tag is needed, or is it probably not worth it?
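
To make the scenario concrete, here is a rough userspace sketch of a
round-robin node cursor whose read-and-advance step is serialized by a
mutex. It is only a model under stated assumptions: struct fake_hstate,
next_node_to_alloc(), adjuster() and run_balanced_demo() are hypothetical
names, pthreads stand in for concurrent callers, and none of this is the
actual mm/hugetlb.c code.

```c
#include <assert.h>
#include <pthread.h>

#define NR_NODES 4
#define NR_THREADS 4
#define ALLOCS_PER_THREAD 1000

/* Hypothetical stand-in for struct hstate; only the cursor is modelled. */
struct fake_hstate {
	pthread_mutex_t lock;		/* serializes cursor updates */
	int next_nid_to_alloc;		/* round-robin cursor */
	long per_node[NR_NODES];	/* pages "allocated" per node */
};

/* Return the current node and advance the cursor; caller holds h->lock. */
static int next_node_to_alloc(struct fake_hstate *h)
{
	int nid = h->next_nid_to_alloc;

	h->next_nid_to_alloc = (nid + 1) % NR_NODES;
	return nid;
}

/* Each thread models one caller adjusting the pool. */
static void *adjuster(void *arg)
{
	struct fake_hstate *h = arg;

	for (int i = 0; i < ALLOCS_PER_THREAD; i++) {
		pthread_mutex_lock(&h->lock);
		h->per_node[next_node_to_alloc(h)]++;
		pthread_mutex_unlock(&h->lock);
	}
	return NULL;
}

/* Run the adjusters; return 1 if every node received the same share. */
static int run_balanced_demo(void)
{
	struct fake_hstate h = { .next_nid_to_alloc = 0 };
	pthread_t tid[NR_THREADS];

	pthread_mutex_init(&h.lock, NULL);
	for (int i = 0; i < NR_THREADS; i++)
		pthread_create(&tid[i], NULL, adjuster, &h);
	for (int i = 0; i < NR_THREADS; i++)
		pthread_join(tid[i], NULL);
	pthread_mutex_destroy(&h.lock);

	for (int n = 0; n < NR_NODES; n++)
		if (h.per_node[n] != (long)NR_THREADS * ALLOCS_PER_THREAD / NR_NODES)
			return 0;
	return 1;
}
```

With the lock held around the read-and-advance step, the cursor walks the
nodes strictly in order, so each node ends up with the same count. Without
it, two callers could read the same cursor value and skew the distribution,
which is the effect the changelog describes.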

> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -566,6 +566,7 @@ HPAGEFLAG(Freed, freed)
> #define HSTATE_NAME_LEN 32
> /* Defines one hugetlb page size */
> struct hstate {
> + struct mutex mutex;

I am also with Michal here: renaming the mutex to something closer to
its function would make it easier to understand without diving too deep
into the code.

> int next_nid_to_alloc;
> int next_nid_to_free;
> unsigned int order;
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f9ba63fc1747..404b0b1c5258 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2616,6 +2616,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> else
> return -ENOMEM;
>
> + /* mutex prevents concurrent adjustments for the same hstate */
> + mutex_lock(&h->mutex);
> spin_lock(&hugetlb_lock);

I find the above comment a bit misleading.
AFAIK, hugetlb_lock also protects against concurrent adjustments for the
same hstate (hugepage_activelist, free_huge_pages, surplus_huge_pages,
etc...).
Would it be more appropriate to say that mutex_lock() only prevents
simultaneous sysfs/proc operations?
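
To illustrate the distinction I have in mind, below is a minimal userspace
sketch, not kernel code. The names resize_lock, counters_lock,
set_pool_size(), resizer() and run_resize_demo() are hypothetical, a second
pthread mutex stands in for the hugetlb_lock spinlock, and malloc() stands
in for the page allocator. The inner lock protects the counters and has to
be dropped around the allocation; the outer lock only keeps two resize
requests from interleaving.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical model: names mirror the patch, nothing here is kernel code. */
static pthread_mutex_t resize_lock = PTHREAD_MUTEX_INITIALIZER;   /* models h->mutex */
static pthread_mutex_t counters_lock = PTHREAD_MUTEX_INITIALIZER; /* models hugetlb_lock */
static unsigned long nr_huge_pages;

/* Grow or shrink the modelled pool to exactly 'count' pages. */
static int set_pool_size(unsigned long count)
{
	int ret = 0;

	pthread_mutex_lock(&resize_lock);	/* outer: one resizer at a time */
	pthread_mutex_lock(&counters_lock);	/* inner: protects the counters */

	while (nr_huge_pages < count) {
		void *page;

		/* Drop the inner lock around the allocation, which may block. */
		pthread_mutex_unlock(&counters_lock);
		page = malloc(64);
		pthread_mutex_lock(&counters_lock);

		if (!page) {
			ret = -1;
			break;
		}
		free(page);			/* the model tracks only the count */
		nr_huge_pages++;
	}
	while (nr_huge_pages > count)
		nr_huge_pages--;

	pthread_mutex_unlock(&counters_lock);
	pthread_mutex_unlock(&resize_lock);
	return ret;
}

static void *resizer(void *arg)
{
	set_pool_size(*(unsigned long *)arg);
	return NULL;
}

/* Two concurrent requests; returns the final pool size. */
static unsigned long run_resize_demo(void)
{
	unsigned long a = 100, b = 200;
	pthread_t t1, t2;

	pthread_create(&t1, NULL, resizer, &a);
	pthread_create(&t2, NULL, resizer, &b);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return nr_huge_pages;
}
```

In this model the counters would stay consistent with the inner lock alone,
but because it is dropped inside the grow loop, two requests could interleave
and overshoot their targets. The outer lock is what guarantees the pool ends
at one of the two requested sizes, which is why I would describe it as
serializing the user-triggered operations rather than the adjustments
themselves.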

Reviewed-by: Oscar Salvador <osalvador@suse.e>


--
Oscar Salvador
SUSE L3