Re: [PATCH v5 3/4] mm/memory-failure: skip take_page_off_buddy after dissolving HWPoison HugeTLB page

From: Jiaqi Yan

Date: Sun Jun 14 2026 - 20:17:10 EST

On Tue, Jun 9, 2026 at 12:21 AM Miaohe Lin <linmiaohe@xxxxxxxxxx> wrote:
>
> On 2026/5/31 13:58, Jiaqi Yan wrote:
> > Now that HWPoison subpage(s) within HugeTLB page will be rejected by
> > buddy allocator during dissolve_free_hugetlb_folio(), there is no
> > need to drain_all_pages() and take_page_off_buddy() anymore. In fact,
> > calling take_page_off_buddy() after dissolve_free_hugetlb_folio()
> > succeeded returns false, making caller think __page_handle_poison()
> > failed.
> >
> > Add __hugepage_handle_poison() and replace __page_handle_poison() at
> > HugeTLB specific call sites. The being handled HugeTLB page either
> > is free at the moment of try_memory_failure_hugetlb(), or becomes
> > free at the moment of me_huge_page().
> >
> > Signed-off-by: Jiaqi Yan <jiaqiyan@xxxxxxxxxx>
> > ---
> > mm/memory-failure.c | 36 ++++++++++++++++++++++++++++++------
> > 1 file changed, 30 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 95979b7995c1..098c4407e818 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -163,6 +163,30 @@ static struct rb_root_cached pfn_space_itree = RB_ROOT_CACHED;
> > static DEFINE_MUTEX(pfn_space_lock);
> >
> > /*
> > + * Only for a HugeTLB page being handled by memory_failure(). The key
> > + * difference to soft_offline() is that, no HWPoison subpage will make
> > + * into buddy allocator after a successful dissolve_free_hugetlb_folio(),
> > + * so take_page_off_buddy() is unnecessary.
> > + */
> > +static int __hugepage_handle_poison(struct page *page)
> > +{
> > + struct folio *folio = page_folio(page);
> > +
> > + VM_WARN_ON_FOLIO(!folio_test_hwpoison(folio), folio);
> > +
> > + /*
> > + * Can't use dissolve_free_hugetlb_folio() without a reliable
> > + * raw_hwp_list telling which subpage is HWPoison.
>
> This reminds me that hugetlb_raw_hwp_unreliable folios can be freed into buddy yet?
> Should we handle them too?

I think we can (and very likely already do?) handle such
hugetlb_raw_hwp_unreliable folios by "leaking" them: neither freeing
them to the buddy allocator nor allocating them in
dequeue_hugetlb_folio_node_exact().

>
> > + */
> > + if (folio_test_hugetlb_raw_hwp_unreliable(folio))
> > + /* raw_hwp_list becomes unreliable when kmalloc() fails. */
> > + return -ENOMEM;
>
> If folios have hugetlb_raw_hwp_unreliable set, hugetlb_update_hwpoison will return
> MF_HUGETLB_FOLIO_PRE_POISONED thus these folios cannot reach here, e.g. me_huge_page.
> The only way these folios can reach here is that they are hwpoisoned first time so
> hugetlb_update_hwpoison returns 0 even if hugetlb_raw_hwp_unreliable is set at the same

For a hugetlb folio that is HWPoison-ed for the first time,
hugetlb_update_hwpoison() still returns rc=0 when kmalloc_obj() fails.
So get_huge_page_for_hwpoison() won't override ret with rc. IOW, ret
will be MF_HUGETLB_IN_USED or MF_HUGETLB_FREED. I believe
MF_HUGETLB_IN_USED can still get into me_huge_page(), while
MF_HUGETLB_FREED goes directly to __hugepage_handle_poison().

So my intent is to block both paths from reaching
dissolve_free_hugetlb_folio().
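
To make the flow above concrete, here is a rough userspace model. It is
not kernel code: every model_* name, the struct model_folio fields, and
the enum values are made up for illustration, and the real functions do
far more than this. It only sketches the claim that a first-time
poisoned folio keeps ret as MF_HUGETLB_FREED or MF_HUGETLB_IN_USED even
when the raw_hwp_list allocation fails, and that the unreliable check
then stops the dissolve.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative values only; not taken from the kernel headers. */
enum model_mf_status {
	MF_HUGETLB_FREED = 1,
	MF_HUGETLB_IN_USED,
	MF_HUGETLB_FOLIO_PRE_POISONED,
};

/* Hypothetical stand-in for the folio state discussed in this thread. */
struct model_folio {
	bool hwpoison;           /* folio already marked HWPoison */
	bool raw_hwp_unreliable; /* raw_hwp_list allocation failed */
	bool free;               /* folio is free (not in use) */
	bool dissolved;          /* the model "dissolved" the folio */
};

/* Models hugetlb_update_hwpoison(): rc=0 on first poison, even on ENOMEM. */
int model_update_hwpoison(struct model_folio *f, bool alloc_fails)
{
	if (f->hwpoison)
		return MF_HUGETLB_FOLIO_PRE_POISONED;
	f->hwpoison = true;
	if (alloc_fails)
		f->raw_hwp_unreliable = true;
	return 0;
}

/* Models get_huge_page_for_hwpoison(): rc overrides ret only if nonzero. */
int model_get_huge_page_for_hwpoison(struct model_folio *f, bool alloc_fails)
{
	int ret = f->free ? MF_HUGETLB_FREED : MF_HUGETLB_IN_USED;
	int rc = model_update_hwpoison(f, alloc_fails);

	if (rc)
		ret = rc;
	return ret;
}

/* Models __hugepage_handle_poison(): refuse to dissolve when unreliable. */
int model_hugepage_handle_poison(struct model_folio *f)
{
	assert(f->hwpoison); /* stands in for VM_WARN_ON_FOLIO() */

	if (f->raw_hwp_unreliable)
		return -ENOMEM;
	f->dissolved = true; /* stands in for dissolve_free_hugetlb_folio() */
	return 0;
}
```

In this model, a free folio whose allocation fails on first poison comes
back as MF_HUGETLB_FREED with raw_hwp_unreliable set, and
model_hugepage_handle_poison() then returns -ENOMEM without dissolving
it. A second poison event on the same folio returns
MF_HUGETLB_FOLIO_PRE_POISONED and never reaches the handler.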

> time. In that case, we can simply dissolve the hugetlb folios and then take the sole hwpoisoned
> @page off buddy? But this might not be a good idea as it is really fragile...

I think it is fragile too, and would prefer the leaking approach.
What are your thoughts?

>
> Thanks.
> .