Re: [PATCH v8 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE
From: David Hildenbrand (Arm)
Date: Mon Jun 01 2026 - 09:29:22 EST

On 6/1/26 14:28, Miaohe Lin wrote:
> On 2026/5/27 22:06, Breno Leitao wrote:
>> get_any_page() collapses every HWPoisonHandlable() rejection into a
>> single -EIO via the __get_hwpoison_page() -> -EBUSY -> shake_page()
>> -> retry path. That is correct for the transient case (a userspace
>> folio briefly off LRU during migration or compaction, which a later
>> shake can drag back), but wrong for stable kernel-owned pages: slab,
>> page-table, large-kmalloc and PG_reserved pages will never become
>> HWPoisonHandlable(), so the retry loop is wasted work and the final
>> -EIO loses the "this is structurally unrecoverable" information.
>> memory_failure() then maps -EIO into MF_MSG_GET_HWPOISON, which the
>> panic-on-unrecoverable sysctl deliberately does not act on.
>>
>> Introduce HWPoisonKernelOwned(), a small predicate that positively
>> identifies pages the hwpoison handler cannot recover from:
>>
>>     HWPoisonKernelOwned(p, flags) :=
>>         !(MF_SOFT_OFFLINE && page_has_movable_ops(p)) &&
>>         (PageReserved(p) || PageSlab(p) ||
>>          PageTable(p) || PageLargeKmalloc(p))
>>
>> The MF_SOFT_OFFLINE / page_has_movable_ops() opt-out mirrors the
>> same exception in HWPoisonHandlable(): soft-offline is allowed to
>> migrate movable_ops pages even though they are not on the LRU, and
>> we must not pre-empt that with an unrecoverable verdict.
>>
>> The list is intentionally not exhaustive. vmalloc and kernel-stack
>> pages, for example, do not carry a page_type bit and would need a
>> different oracle; they keep going through the existing retry path
>> unchanged. This is the smallest set we can identify with certainty
>> by page type.
>>
>> Wire the helper into the top of get_any_page() to short-circuit
>> those pages before the retry loop runs. On a hit, drop the caller's
>> MF_COUNT_INCREASED reference (if any) and return -ENOTRECOVERABLE
>> straight away. Pages outside the helper's positive list still take
>> the existing retry path and return -EIO, leaving operator-visible
>> behaviour for those cases unchanged.
>>
>> Extend the unhandlable-page pr_err() to fire for either errno and
>> update the get_hwpoison_page() kerneldoc to document the new return.
>>
>> memory_failure() still folds every negative return into
>> MF_MSG_GET_HWPOISON via its existing "else if (res < 0)" branch, so
>> this patch on its own only changes the errno that soft_offline_page()
>> can propagate to its callers. A follow-up wires -ENOTRECOVERABLE
>> through memory_failure() and reports MF_MSG_KERNEL for the
>> unrecoverable cases, which is what the
>> panic_on_unrecoverable_memory_failure sysctl observes.
>
> Thanks for your patch.
>
>>
>> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
>> Suggested-by: Lance Yang <lance.yang@xxxxxxxxx>
>> Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
>> ---
>> mm/memory-failure.c | 42 ++++++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 40 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index f4d3e6e20e13..8f63bdfeff8f 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1325,6 +1325,28 @@ static inline bool HWPoisonHandlable(struct page *page, unsigned long flags)
>>  	return PageLRU(page) || is_free_buddy_page(page);
>>  }
>>
>> +/*
>> + * Positive identification of pages the hwpoison handler cannot recover.
>> + * These page types are owned by kernel internals (no userspace mapping
>> + * to unmap, no file mapping to invalidate, no migration target), so the
>> + * shake_page() / retry loop in get_any_page() can never turn them into
>> + * something HWPoisonHandlable() will accept. Short-circuit them to
>> + * -ENOTRECOVERABLE so callers can panic on operator request instead of
>> + * spinning through retries that exit as a transient-looking -EIO.
>> + *
>> + * The MF_SOFT_OFFLINE / page_has_movable_ops() opt-out mirrors
>> + * HWPoisonHandlable(): soft-offline is allowed to migrate movable_ops
>> + * pages even though they are not on the LRU.
>> + */
>> +static inline bool HWPoisonKernelOwned(struct page *page, unsigned long flags)
>> +{
>> +	if ((flags & MF_SOFT_OFFLINE) && page_has_movable_ops(page))
>> +		return false;
>> +
>> +	return PageReserved(page) || PageSlab(page) ||
>
> Once shake_page finds a lightweight range-based way to shrink slab, slab pages could be freed
> into buddy and above PageSlab test should be removed then. Maybe add a TODO or XXX here?
>
>> +	       PageTable(page) || PageLargeKmalloc(page);
>
> I'm not sure but is it safe or a common way to test PageReserved, PageSlab,
> PageTable and PageLargeKmalloc without extra page refcnt?

Checking typed pages (PageSlab, PageTable, PageLargeKmalloc) in a racy
fashion is fine.

Checking PageReserved in a racy fashion is fine as well. TESTPAGEFLAG() will
allow checking it on compound pages.

For PageLargeKmalloc, we would want to check the head page, though. The page
type is only stored for the head page.

So maybe we want to look up the compound head (if any) and perform the type
checks against that?
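
To illustrate the idea, here is a rough userspace toy model, not kernel
code and untested against the real tree. All toy_* names, the struct layout
and the flag value are made up for illustration; the real helper would
presumably use compound_head() and the existing page type tests. The model
assumes the type lives only on the head page and keeps the reserved and
movable_ops checks on the page that was passed in:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only: none of these are the real kernel definitions. */
enum toy_page_type {
	TOY_TYPE_NONE,
	TOY_TYPE_SLAB,
	TOY_TYPE_TABLE,
	TOY_TYPE_LARGE_KMALLOC,
};

#define TOY_MF_SOFT_OFFLINE 0x1UL /* hypothetical value, for the model only */

struct toy_page {
	struct toy_page *head;   /* NULL for a head or order-0 page */
	enum toy_page_type type; /* in this model, only set on the head page */
	bool reserved;           /* per-page flag, tested on the page itself */
	bool movable_ops;        /* models page_has_movable_ops() */
};

/* Models compound_head(): tail pages resolve to their head page. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	return page->head ? page->head : page;
}

/* Models the proposed predicate, with the type checks done on the head. */
static bool toy_hwpoison_kernel_owned(struct toy_page *page,
				      unsigned long flags)
{
	struct toy_page *head = toy_compound_head(page);

	if ((flags & TOY_MF_SOFT_OFFLINE) && page->movable_ops)
		return false;

	return page->reserved ||
	       head->type == TOY_TYPE_SLAB ||
	       head->type == TOY_TYPE_TABLE ||
	       head->type == TOY_TYPE_LARGE_KMALLOC;
}

/* Usage example: a tail page of a large kmalloc allocation is caught. */
void toy_demo(void)
{
	struct toy_page head = { .type = TOY_TYPE_LARGE_KMALLOC };
	struct toy_page tail = { .head = &head };

	/* Testing the tail page's own type would miss the allocation. */
	assert(tail.type == TOY_TYPE_NONE);
	assert(toy_hwpoison_kernel_owned(&tail, 0));
}
```

In the model, a tail page carries no type of its own, so only the lookup
through the head page lets the predicate classify it.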
--
Cheers,
David