Re: [PATCH v8 3/6] mm/memory-failure: report MF_MSG_KERNEL for unrecoverable kernel pages
From: David Hildenbrand (Arm)
Date: Mon Jun 01 2026 - 09:30:05 EST
On 5/27/26 16:06, Breno Leitao wrote:
> The previous patch teaches get_any_page() to return -ENOTRECOVERABLE
> for stable unhandlable kernel pages (PG_reserved, slab, page tables,
> large-kmalloc). memory_failure() still folds every negative return
> into MF_MSG_GET_HWPOISON, so callers that want to react to the
> unrecoverable cases (a panic option, smarter logging) cannot tell
> them apart from transient page-allocator races.
>
> Turn the post-call branch into a switch over the get_hwpoison_page()
> return code: map -ENOTRECOVERABLE to MF_MSG_KERNEL and any other
> negative return to MF_MSG_GET_HWPOISON. case 0 keeps the existing
> free-buddy / kernel-high-order handling and case 1 falls through to
> the rest of memory_failure() unchanged.
>
> The MF_MSG_KERNEL label and tracepoint string are kept as
> "reserved kernel page" to avoid breaking userspace tools that match
> on those literals; the enum value still adequately tags the failure
> even though it now also covers slab, page tables and large-kmalloc
> pages.
>
> Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
> Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> ---
> mm/memory-failure.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 8f63bdfeff8f..14c0a958638c 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -2426,7 +2426,8 @@ int memory_failure(unsigned long pfn, int flags)
>  	 * that may make page_ref_freeze()/page_ref_unfreeze() mismatch.
>  	 */
>  	res = get_hwpoison_page(p, flags);
> -	if (!res) {
> +	switch (res) {
> +	case 0:
>  		if (is_free_buddy_page(p)) {
>  			if (take_page_off_buddy(p)) {
>  				page_ref_inc(p);
> @@ -2445,7 +2446,19 @@ int memory_failure(unsigned long pfn, int flags)
>  			res = action_result(pfn, MF_MSG_KERNEL_HIGH_ORDER, MF_IGNORED);
>  		}
>  		goto unlock_mutex;
> -	} else if (res < 0) {
> +	case 1:
> +		/* Got a refcount on a handlable page. */
> +		break;
> +	case -ENOTRECOVERABLE:
> +		/*
> +		 * Stable unhandlable kernel-owned page (PG_reserved,
> +		 * slab, page tables, large-kmalloc).
> +		 * No recovery possible.
> +		 */
> +		res = action_result(pfn, MF_MSG_KERNEL, MF_IGNORED);
> +		goto unlock_mutex;
> +	default:
> +		/* Transient lifecycle race with the page allocator. */
>  		res = action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
>  		goto unlock_mutex;
>  	}
>
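For anyone following along outside the tree, the return-code mapping
in the hunk above can be sketched as a small userspace function. This
is only an illustration, not kernel code: the sketch_* names are
hypothetical stand-ins for the real mf_action_page_type values and the
control flow in memory_failure(), and it assumes a libc whose
<errno.h> defines ENOTRECOVERABLE (glibc does).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins; the kernel's real enum is mf_action_page_type. */
enum sketch_outcome {
	SKETCH_FREE_OR_HIGH_ORDER,	/* res == 0: free-buddy / high-order path */
	SKETCH_CONTINUE,		/* res == 1: refcount taken, keep going */
	SKETCH_MSG_KERNEL,		/* res == -ENOTRECOVERABLE */
	SKETCH_MSG_GET_HWPOISON,	/* any other value */
};

/* Mirrors the shape of the switch in the patch, nothing more. */
static enum sketch_outcome sketch_classify(int res)
{
	switch (res) {
	case 0:
		return SKETCH_FREE_OR_HIGH_ORDER;
	case 1:
		return SKETCH_CONTINUE;
	case -ENOTRECOVERABLE:
		return SKETCH_MSG_KERNEL;
	default:
		return SKETCH_MSG_GET_HWPOISON;
	}
}

/* Arbitrary negative errnos stand in for the "other" failures here. */
static void sketch_self_check(void)
{
	assert(sketch_classify(0) == SKETCH_FREE_OR_HIGH_ORDER);
	assert(sketch_classify(1) == SKETCH_CONTINUE);
	assert(sketch_classify(-ENOTRECOVERABLE) == SKETCH_MSG_KERNEL);
	assert(sketch_classify(-EBUSY) == SKETCH_MSG_GET_HWPOISON);
	assert(sketch_classify(-EIO) == SKETCH_MSG_GET_HWPOISON);
}
```

The point of the explicit -ENOTRECOVERABLE case is that a caller can
now single out the unrecoverable kernel pages instead of seeing them
folded in with every other negative return.
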
Acked-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
--
Cheers,
David