Re: [syzbot] [mm?] WARNING in memory_failure
From: David Hildenbrand
Date: Thu Oct 02 2025 - 03:25:53 EST
On 02.10.25 01:58, jane.chu@xxxxxxxxxx wrote:
Hi, Zi Yan,
On 9/30/2025 9:51 PM, syzbot wrote:
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
lost connection to test machine
Tested on:
commit: d8795075 mm/huge_memory: do not change split_huge_page..
git tree: https://github.com/x-y-z/linux-dev.git fix_split_page_min_order-for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=17ce96e2580000
kernel config: https://syzkaller.appspot.com/x/.config?x=714d45b6135c308e
dashboard link: https://syzkaller.appspot.com/bug?extid=e6367ea2fdab6ed46056
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
userspace arch: arm64
Note: no patches were applied.
My hunch is that the branch
https://github.com/x-y-z/linux-dev.git
fix_split_page_min_order-for-kernelci
alone is not enough. On ARM64, the page cache pages of
/dev/nullb0 in the test case probably have min_order > 0, so the
THP split fails, as the console message shows:
[ 200.378989][T18221] Memory failure: 0x124d30: recovery action for unsplit thp: Failed
With lots of poisoned THP pages stuck in the page cache, OOM could
trigger too soon.
I think it's worth trying to add the additional changes I suggested earlier:
https://lore.kernel.org/lkml/7577871f-06be-492d-b6d7-8404d7a045e0@xxxxxxxxxx/
I think that makes sense in this case. I said earlier that I don't think
even splitting makes sense in this case, but as you say, splitting at
least allows us to reclaim the remainder of the folio, even though we
cannot proceed with handling the remaining large folio later on.
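
To put rough numbers on that, here is a minimal userspace sketch of the
arithmetic only. It is not kernel code: both helper names are hypothetical,
and the orders used below (a folio of order 9 and a min_order of 2) are
assumed for illustration, not taken from the syzbot report.

```c
#include <assert.h>

/*
 * Hypothetical helper, not a kernel API: number of base pages that remain
 * in the same folio as the poisoned page, and therefore stay stuck.
 * If the split does not happen, the whole original folio stays stuck.
 * If the folio is split down to new_order, only the new_order-sized
 * folio containing the poisoned page stays stuck.
 */
static unsigned long pages_stuck_with_poison(unsigned int folio_order,
                                             unsigned int new_order,
                                             int split_done)
{
    /* A split can only produce folios no larger than the original. */
    assert(new_order <= folio_order);

    if (!split_done)
        return 1UL << folio_order;
    return 1UL << new_order;
}

/*
 * Hypothetical helper: base pages of the original folio that end up in
 * folios without the poisoned page and can therefore be reclaimed.
 */
static unsigned long pages_reclaimable(unsigned int folio_order,
                                       unsigned int new_order,
                                       int split_done)
{
    return (1UL << folio_order) -
           pages_stuck_with_poison(folio_order, new_order, split_done);
}
```

With the assumed orders, pages_reclaimable(9, 2, 0) is 0 when the split is
not done, while pages_reclaimable(9, 2, 1) is 508 when the folio is split
to min_order; the remaining 4 pages stay in the large folio that holds the
poisoned page.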
--
Cheers
David / dhildenb