Re: [External] Re: [PATCH v16 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page

From: Oscar Salvador
Date: Wed Feb 24 2021 - 03:34:12 EST


On Wed, Feb 24, 2021 at 11:47:49AM +0800, Muchun Song wrote:
> I have been looking at the dequeue_huge_page_node_exact().
> If a PageHWPoison huge page is in the free pool list, the page will
> not be allocated to the user. The PageHWPoison huge page
> will be skipped in dequeue_huge_page_node_exact().

Yes, now I see where the problem lies.

hugetlb_no_page()->..->dequeue_huge_page_node_exact() will fail if the only
page in the pool is hwpoisoned, as expected.
Then alloc_buddy_huge_page_with_mpol() will be tried, but since the
surplus_huge_pages counter is stale, we will fail there too.
That relates to the problem Mike pointed out: we should decrement
surplus_huge_pages again.
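
To make the failure path concrete, here is a rough user-space model of it.
This is not kernel code: the toy_* names and structs are made up for
illustration, and the only thing borrowed from hugetlb is the idea that the
surplus fallback is refused once surplus_huge_pages has reached
nr_overcommit_huge_pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the hstate counters. */
struct toy_hstate {
	unsigned long surplus_huge_pages;
	unsigned long nr_overcommit_huge_pages;
};

/* Hypothetical stand-in for a huge page sitting in the free pool. */
struct toy_page {
	bool hwpoison;
};

/* Models the dequeue step: poisoned pages in the free pool are skipped. */
static struct toy_page *toy_dequeue(struct toy_page *pool, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!pool[i].hwpoison)
			return &pool[i];
	return NULL;
}

/* Models the surplus fallback: refused once the overcommit limit is hit. */
static bool toy_alloc_surplus(struct toy_hstate *h)
{
	assert(h != NULL);
	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages)
		return false;
	h->surplus_huge_pages++;
	return true;
}

/* Models the fault path: try the pool first, then the surplus fallback. */
static bool toy_fault(struct toy_hstate *h, struct toy_page *pool, size_t n)
{
	if (toy_dequeue(pool, n))
		return true;
	return toy_alloc_surplus(h);
}
```

With a single poisoned page in the pool and a surplus counter that was never
decremented, toy_fault() fails on both paths; once the counter is brought
back down, the fallback succeeds.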

I think hwpoisoned pages should not be in the free pool though.
Probably we want to take them off when we notice we have one,
e.g. dequeue_huge_page_node_exact() could move the page to a separate list
and put it back if it gets unpoisoned.
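
Something along these lines, as a user-space sketch only. The demo_* names
and the extra "poisoned" list are hypothetical; nothing like this exists in
hugetlb today, and the real thing would use the hstate lists and hugetlb_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a huge page on a singly linked list. */
struct demo_page {
	bool hwpoison;
	struct demo_page *next;
};

struct demo_pool {
	struct demo_page *free;     /* pages that can be handed out */
	struct demo_page *poisoned; /* hypothetical quarantine list */
};

static void demo_push(struct demo_page **list, struct demo_page *page)
{
	page->next = *list;
	*list = page;
}

/* Pop a healthy page; park any poisoned page met on the way. */
static struct demo_page *demo_dequeue(struct demo_pool *pool)
{
	assert(pool != NULL);
	while (pool->free) {
		struct demo_page *page = pool->free;

		pool->free = page->next;
		page->next = NULL;
		if (!page->hwpoison)
			return page;
		demo_push(&pool->poisoned, page);
	}
	return NULL;
}

/* Move a quarantined page back to the free list once it is unpoisoned. */
static bool demo_unpoison(struct demo_pool *pool, struct demo_page *page)
{
	struct demo_page **pp;

	for (pp = &pool->poisoned; *pp; pp = &(*pp)->next) {
		if (*pp == page) {
			*pp = page->next;
			page->hwpoison = false;
			demo_push(&pool->free, page);
			return true;
		}
	}
	return false;
}
```

The point of the separate list is that the free pool only ever holds pages
that can actually be handed out, so later dequeues do not have to keep
stepping over the same poisoned page.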

But anyway, that has nothing to do with this (apart from the surplus problem).

--
Oscar Salvador
SUSE L3