Re: [PATCH v4 2/3] mm/memory-failure: add panic option for unrecoverable pages

From: Breno Leitao

Date: Wed Apr 22 2026 - 11:26:28 EST


Hello Miaohe,

On Wed, Apr 22, 2026 at 11:36:11AM +0800, Miaohe Lin wrote:
> On 2026/4/15 20:55, Breno Leitao wrote:
> > Add a sysctl panic_on_unrecoverable_memory_failure that triggers a
> > kernel panic when memory_failure() encounters pages that cannot be
> > recovered. This provides a clean crash with useful debug information
> > rather than allowing silent data corruption.
> >
> > The panic is triggered for three categories of unrecoverable failures,
> > all requiring result == MF_IGNORED:
> >
> > - MF_MSG_KERNEL: reserved pages identified via PageReserved.
> >
> > - MF_MSG_KERNEL_HIGH_ORDER: pages with refcount 0 that are not in the
> > buddy allocator (e.g., tail pages of high-order kernel allocations).
> > A TOCTOU race between get_hwpoison_page() and is_free_buddy_page()
> > is possible when CONFIG_DEBUG_VM is disabled, since check_new_pages()
> > is gated by is_check_pages_enabled() and becomes a no-op. Panicking
> > is still correct: the physical memory has a hardware error regardless
> > of who allocated the page.
>
> What if the page is used by userspace? We can recover from later accessing.
> Would panic here be overkill?

A userspace page should not reach the MF_MSG_KERNEL_HIGH_ORDER branch. The
branch is gated on get_hwpoison_page() == 0, i.e., folio_try_get() observed
_refcount == 0, and that condition rules out a live userspace mapping, no?
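
To spell out the gating as I read it, here is a rough userspace model. It is
not the kernel code: every toy_* name is made up for illustration, and the
only assumption carried over is that a mapped user page holds at least one
reference, so the try-get succeeds and the refcount-0 branch is never taken.

```c
/* Userspace model only; not kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_msg {
	TOY_MSG_BUDDY,			/* free page still in the buddy allocator */
	TOY_MSG_KERNEL_HIGH_ORDER,	/* refcount 0 and not in buddy */
	TOY_MSG_OTHER,			/* reference taken; handled further down */
};

struct toy_page {
	int refcount;	/* stands in for _refcount */
	bool in_buddy;	/* stands in for the is_free_buddy_page() result */
};

/* Models a try-get: succeeds only when the refcount is already non-zero. */
static bool toy_try_get(struct toy_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/* Models the branch selection discussed above. */
static enum toy_msg toy_classify(struct toy_page *p)
{
	assert(p != NULL);

	if (toy_try_get(p))
		return TOY_MSG_OTHER;	/* a mapped user page takes this path */

	return p->in_buddy ? TOY_MSG_BUDDY : TOY_MSG_KERNEL_HIGH_ORDER;
}
```

In this model a page with refcount >= 1 can never be classified as
TOY_MSG_KERNEL_HIGH_ORDER, which is the property I am relying on.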

Are you suggesting that I drop MF_MSG_KERNEL_HIGH_ORDER from here, or that I
document that this branch will not be hit for userspace pages?

Thanks for the review,
--breno