Re: [PATCH resend] mm/memory-failure: fix hugetlb_lock AA deadlock in get_huge_page_for_hwpoison

From: Andrew Morton

Date: Fri May 22 2026 - 23:50:51 EST


On Fri, 22 May 2026 09:03:05 +0800 Wupeng Ma <mawupeng1@xxxxxxxxxx> wrote:

> Two concurrent madvise(MADV_HWPOISON) calls on the same hugetlb page
> can trigger a recursive spinlock self-deadlock (AA deadlock) on
> hugetlb_lock when racing with a concurrent unmap:

Well we don't want that.

> Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()")

So I'll add cc:stable here.

AI review didn't like the unlocked page_folio():

https://sashiko.dev/#/patchset/20260522010305.4099834-1-mawupeng1@xxxxxxxxxx

So I'll add a followup patch which addresses that (and which addresses
Miaohe's naming nit).

Please let's check this - perhaps the locking alteration isn't needed.


From: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Subject: mm-memory-failure-fix-hugetlb_lock-aa-deadlock-in-get_huge_page_for_hwpoison-fix
Date: Fri May 22 08:44:25 PM PDT 2026

- address possible race identified by Sashiko

- s/out/out_unlock/, per Miaohe

Link: https://sashiko.dev/#/patchset/20260522010305.4099834-1-mawupeng1@xxxxxxxxxx
Link: https://lore.kernel.org/f39f405e-4b4b-8f79-70fe-a2b5b62114eb@xxxxxxxxxx
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
Cc: Liam Howlett <liam.howlett@xxxxxxxxxx>
Cc: Lorenzo Stoakes <ljs@xxxxxxxxxx>
Cc: Miaohe Lin <linmiaohe@xxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Mike Rapoport <rppt@xxxxxxxxxx>
Cc: Muchun Song <muchun.song@xxxxxxxxx>
Cc: Naoya Horiguchi <nao.horiguchi@xxxxxxxxx>
Cc: Oscar Salvador (SUSE) <osalvador@xxxxxxxxxx>
Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxxxxx>
Cc: Wupeng Ma <mawupeng1@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

mm/memory-failure.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

--- a/mm/memory-failure.c~mm-memory-failure-fix-hugetlb_lock-aa-deadlock-in-get_huge_page_for_hwpoison-fix
+++ a/mm/memory-failure.c
@@ -1970,14 +1970,15 @@ static int get_huge_page_for_hwpoison(un
 				      bool *migratable_cleared)
 {
 	struct page *page = pfn_to_page(pfn);
-	struct folio *folio = page_folio(page);
+	struct folio *folio;
 	bool count_increased = false;
 	int ret, rc;

 	spin_lock_irq(&hugetlb_lock);
+	folio = page_folio(page);
 	if (!folio_test_hugetlb(folio)) {
 		ret = MF_HUGETLB_NON_HUGEPAGE;
-		goto out;
+		goto out_unlock;
 	} else if (flags & MF_COUNT_INCREASED) {
 		ret = MF_HUGETLB_IN_USED;
 		count_increased = true;
@@ -1993,13 +1994,13 @@ static int get_huge_page_for_hwpoison(un
 	} else {
 		ret = MF_HUGETLB_RETRY;
 		if (!(flags & MF_NO_RETRY))
-			goto out;
+			goto out_unlock;
 	}

 	rc = hugetlb_update_hwpoison(folio, page);
 	if (rc >= MF_HUGETLB_FOLIO_PRE_POISONED) {
 		ret = rc;
-		goto out;
+		goto out_unlock;
 	}

 	/*
@@ -2013,7 +2014,7 @@ static int get_huge_page_for_hwpoison(un

 	spin_unlock_irq(&hugetlb_lock);
 	return ret;
-out:
+out_unlock:
 	spin_unlock_irq(&hugetlb_lock);
 	if (count_increased)
 		folio_put(folio);
_
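
For anyone checking whether the locking alteration is needed, here is a
minimal user-space sketch of the ordering question. It assumes the concern
is that the page-to-folio association can change between the page_folio()
call and the acquisition of hugetlb_lock. Every toy_* name is a
hypothetical stand-in, an atomic_flag spinlock plays the role of
hugetlb_lock, and the "race" is forced by ordering calls in one thread.
It is not kernel code and says nothing about whether the real race exists.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel's struct folio / struct page. */
struct toy_folio {
	bool is_hugetlb;
};

struct toy_page {
	struct toy_folio *head;	/* may be re-pointed by a free/demote path */
};

/* Toy spinlock standing in for hugetlb_lock. */
static atomic_flag toy_lock = ATOMIC_FLAG_INIT;

static void toy_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&toy_lock,
						 memory_order_acquire))
		;	/* spin */
}

static void toy_lock_release(void)
{
	atomic_flag_clear_explicit(&toy_lock, memory_order_release);
}

/* Writer side: re-point the page at another head under the lock. */
static void toy_retarget(struct toy_page *page, struct toy_folio *new_head)
{
	toy_lock_acquire();
	page->head = new_head;
	toy_lock_release();
}

/* Old shape: the caller sampled the head before the lock was taken. */
static bool toy_test_presampled(struct toy_folio *folio)
{
	bool ret;

	toy_lock_acquire();
	ret = folio->is_hugetlb;	/* answers for a possibly stale head */
	toy_lock_release();
	return ret;
}

/* New shape: the head is looked up only once the lock is held. */
static bool toy_lookup_and_test(struct toy_page *page)
{
	struct toy_folio *folio;
	bool ret;

	toy_lock_acquire();
	folio = page->head;
	ret = folio->is_hugetlb;
	toy_lock_release();
	return ret;
}

/* Forces the interleaving: sample, retarget, then test both ways. */
static int toy_demo(void)
{
	struct toy_folio huge = { .is_hugetlb = true };
	struct toy_folio base = { .is_hugetlb = false };
	struct toy_page page = { .head = &huge };
	struct toy_folio *stale = page.head;	/* sampled before the lock */

	toy_retarget(&page, &base);

	assert(toy_test_presampled(stale));	/* stale: still "hugetlb" */
	assert(!toy_lookup_and_test(&page));	/* fresh: sees the update */
	return 0;
}
```

Calling toy_demo() from a main() exercises both shapes. If the real
page-to-folio association cannot change for a reason not modelled here,
the two shapes are equivalent and the lookup can stay where it was.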