Re: [PATCH v5 3/4] mm/memory-failure: skip take_page_off_buddy after dissolving HWPoison HugeTLB page
From: Miaohe Lin
Date: Tue Jun 09 2026 - 03:25:33 EST
On 2026/5/31 13:58, Jiaqi Yan wrote:
> Now that HWPoison subpage(s) within HugeTLB page will be rejected by
> buddy allocator during dissolve_free_hugetlb_folio(), there is no
> need to drain_all_pages() and take_page_off_buddy() anymore. In fact,
> calling take_page_off_buddy() after dissolve_free_hugetlb_folio()
> succeeded returns false, making caller think __page_handle_poison()
> failed.
>
> Add __hugepage_handle_poison() and replace __page_handle_poison() at
> HugeTLB specific call sites. The being handled HugeTLB page either
> is free at the moment of try_memory_failure_hugetlb(), or becomes
> free at the moment of me_huge_page().
>
> Signed-off-by: Jiaqi Yan <jiaqiyan@xxxxxxxxxx>
> ---
> mm/memory-failure.c | 36 ++++++++++++++++++++++++++++++------
> 1 file changed, 30 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 95979b7995c1..098c4407e818 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -163,6 +163,30 @@ static struct rb_root_cached pfn_space_itree = RB_ROOT_CACHED;
> static DEFINE_MUTEX(pfn_space_lock);
>
> /*
> + * Only for a HugeTLB page being handled by memory_failure(). The key
> + * difference to soft_offline() is that, no HWPoison subpage will make
> + * into buddy allocator after a successful dissolve_free_hugetlb_folio(),
> + * so take_page_off_buddy() is unnecessary.
> + */
> +static int __hugepage_handle_poison(struct page *page)
> +{
> +	struct folio *folio = page_folio(page);
> +
> +	VM_WARN_ON_FOLIO(!folio_test_hwpoison(folio), folio);
> +
> +	/*
> +	 * Can't use dissolve_free_hugetlb_folio() without a reliable
> +	 * raw_hwp_list telling which subpage is HWPoison.

This reminds me: can folios with hugetlb_raw_hwp_unreliable set still be
freed into the buddy allocator? Should we handle them too?

> +	 */
> +	if (folio_test_hugetlb_raw_hwp_unreliable(folio))
> +		/* raw_hwp_list becomes unreliable when kmalloc() fails. */
> +		return -ENOMEM;

If a folio already has hugetlb_raw_hwp_unreliable set,
hugetlb_update_hwpoison() returns MF_HUGETLB_FOLIO_PRE_POISONED, so such a
folio cannot reach this point, e.g. via me_huge_page(). The only way it can
get here is when the folio is hwpoisoned for the first time:
hugetlb_update_hwpoison() then returns 0 even though
hugetlb_raw_hwp_unreliable gets set in that same call. In that case, could
we simply dissolve the hugetlb folio and then take the sole hwpoisoned
@page off buddy? That might not be a good idea though, as it is really
fragile...

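To make the two cases above concrete, here is a rough userspace model (not
kernel code). The toy_* names, the struct and the alloc_ok parameter are all
made up for illustration, and the return-value logic only mirrors what is
described in this mail, so please treat it as a sketch under those
assumptions rather than the real hugetlb_update_hwpoison():

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the return codes discussed above. */
enum { TOY_MF_OK = 0, TOY_MF_HUGETLB_FOLIO_PRE_POISONED = 1 };

/* Toy folio carrying only the two flags this discussion is about. */
struct toy_folio {
	bool hwpoison;
	bool raw_hwp_unreliable;
};

/*
 * Assumed behaviour: a folio that is already unreliable is reported as
 * pre-poisoned. A failed allocation of the raw_hwp entry (alloc_ok ==
 * false) marks the folio unreliable, but the return value still only
 * reflects whether the folio was poisoned before this call.
 */
static int toy_update_hwpoison(struct toy_folio *f, bool alloc_ok)
{
	int ret = f->hwpoison ? TOY_MF_HUGETLB_FOLIO_PRE_POISONED : TOY_MF_OK;

	f->hwpoison = true;
	if (f->raw_hwp_unreliable)
		return TOY_MF_HUGETLB_FOLIO_PRE_POISONED;
	if (!alloc_ok)
		f->raw_hwp_unreliable = true;
	return ret;
}

/* True when the handler would be reached with an unreliable folio. */
static bool toy_reaches_handler_unreliable(struct toy_folio *f, bool alloc_ok)
{
	return toy_update_hwpoison(f, alloc_ok) == TOY_MF_OK &&
	       f->raw_hwp_unreliable;
}
```

In this model, only a folio that starts out as { false, false } and is
updated with alloc_ok == false reaches the handler with the unreliable flag
set. A folio that is already { true, true } is always filtered out as
pre-poisoned.
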
Thanks.