Re: WARNING in memory_failure() at include/linux/huge_mm.h:635 triggered
From: David Hildenbrand (arm)
Date: Wed Feb 04 2026 - 12:17:17 EST
On 2/4/26 18:12, David Hildenbrand (arm) wrote:
On 2/4/26 13:49, 是参差 wrote:
Hi,
I’m reporting a reproducible WARNING triggered in the hwpoison / memory_failure path when injecting a hardware-poison event via madvise(MADV_HWPOISON).
The warning is triggered by a syzkaller C reproducer that:
maps a file-backed region with MAP_FIXED, touches related VMAs, and then
calls madvise() with MADV_HWPOISON over a large range.
The kernel reports a VM_WARN_ON_ONCE_FOLIO(1) from memory_failure() and points to include/linux/huge_mm.h:635, suggesting an unexpected folio/page state encountered while handling a poisoned compound/huge folio.
The target page appears to be a compound head page (order:3) already marked hwpoison. memory_failure() seems to reach a branch that unconditionally warns (VM_WARN_ON_ONCE_FOLIO(1) at include/linux/huge_mm.h:635), which usually indicates an “unreachable”/unexpected folio type or state transition in the huge/compound folio handling logic during hwpoison processing.
This looks like a kernel-side invariant violation rather than a pure userspace misuse, since the warning is emitted from an unconditional VM_WARN_ON_ONCE_FOLIO(1) site.
Reproducer:
C reproducer: https://pastebin.com/raw/UxennX2B
console output: https://pastebin.com/raw/wrhKRwZY
kernel config: https://pastebin.com/raw/dP93yBLn
Kernel:
HEAD commit: 63804fed149a6750ffd28610c5c1c98cce6bd377
git tree: torvalds/linux
kernel version: 6.19.0-rc7 (QEMU Ubuntu 24.10)
@Zi Yan, this is weird.
We run into the VM_WARN_ON_ONCE_FOLIO(1, folio); in min_order_for_split(),
which is only active with !CONFIG_TRANSPARENT_HUGEPAGE.
But how do we get a large folio in that case? folio_test_large(folio) succeeded.
I think we rule out hugetlb earlier in that function.
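As a rough userspace model of the ordering of checks described above (this is not the kernel source; struct model_folio and all model_* names are made up for illustration, and the real memory_failure() does far more than this):

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct folio: only what this model needs. */
struct model_folio {
	unsigned int order;	/* 0: small folio, >0: large (compound) */
	bool hugetlb;		/* assumed to be handled by a separate path */
};

/* Counts how often the "should be unreachable" stub was hit. */
static int model_warnings;

/*
 * Model of the !CONFIG_TRANSPARENT_HUGEPAGE stub: without THP nothing
 * is expected to ask for a split order, so it warns unconditionally.
 */
static int model_min_order_for_split(const struct model_folio *folio)
{
	model_warnings++;
	fprintf(stderr, "WARNING: order-%u folio reached the !THP stub\n",
		folio->order);
	return -EINVAL;
}

/* Model of the check ordering: hugetlb first, then the large-folio test. */
static int model_memory_failure(const struct model_folio *folio)
{
	if (folio->hugetlb)
		return 0;	/* ruled out before the large-folio check */
	if (folio->order > 0)	/* stands in for folio_test_large() */
		return model_min_order_for_split(folio);
	return 0;		/* small folio: nothing to split */
}
```

With order = 3 and hugetlb = false, as in the report, the model lands in the stub and returns -EINVAL; a small folio or a hugetlb folio never reaches it.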
Looking into the full console output, this is an order-3 folio (fully mapped).
How do we end up with a large folio here? The only way I am aware of is something allocating
an order-3 compound page (not a folio) and mapping it into the page tables. Yes, that
is nasty and can still happen, but I am not sure yet whether that is really what the reproducer
triggers.
Looking again,
mapping:0000000000000000 index:0xffff88800fe2e600
At least mapping==0 could indicate a non-folio thing.
--
Cheers,
David