Re: [PATCH] mm/page_alloc: make percpu_pagelist_high_fraction reads lock-free
From: David Hildenbrand (Red Hat)
Date: Wed Dec 03 2025 - 06:28:57 EST
On 12/3/25 10:23, Gregory Price wrote:
On Wed, Dec 03, 2025 at 10:08:55AM +0100, David Hildenbrand (Red Hat) wrote:
On 12/3/25 10:02, Gregory Price wrote:
My transient failure (although i'm not sure it was actually transient, i
killed it and retried after a few minutes and it succeeded immediately)
was on a ZONE_MOVABLE block.
Okay, so that one should not bail out. Long-term pinnings must never end up on
such memory, and if it happens, we have to identify why and fix it.
We have this known problem of "stream of short-term pinnings" that can
temporarily turn memory effectively unmovable. Juan will talk about that at
LPC [1].
Nice, fun, good topic. Looking forward to Japan n_n
We have another set of problematic cases (vmsplice(), fuse) but I would
assume that these are not the cases you are hitting.
We do use fuse, but this system was relatively quiet when i tried this.
We do have some proactive reclaim / demotion going on, but i don't think
it was that (see below).
Kind of suggested to me there was some bad condition that resolved once I
took a second to release the lock and try again.
Hard to tell I'm afraid. Do you still have the dump_folio() calls we print
when migration fails?
What luck, I do! :D
:)
And i just noticed it's the same page over and over
[ 3404.119270] migrating pfn c06f176 failed ret:1
[ 3404.129152] page: refcount:4 mapcount:0 mapping:0000000061ca20ba index:0xad28e5b pfn:0xc06f176
[ 3404.148284] memcg:ffff88842e855000
[ 3404.155834] aops:btree_aops ino:1
Small folio. Not GUP-pinned (FOLL_PIN); otherwise our refcount would be >= 1024.
It could be ordinary GUP (FOLL_GET) e.g., from vmsplice or some older O_DIRECT user that was not converted to FOLL_PIN yet. But maybe it's just btrfs / something else that temporarily holds a folio reference.
Given that this is from 6.13 ... hard to tell :)
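For reference, the refcount argument above can be sketched in userspace C. This is only an illustration: the helper name small_folio_maybe_pinned() is made up here, and the constant is assumed to match the kernel's GUP_PIN_COUNTING_BIAS (1U << 10), which each FOLL_PIN adds to the refcount of a small folio. Large folios are tracked differently and are not modelled.

```c
#include <stdbool.h>

/* Assumption: same value as the kernel's GUP_PIN_COUNTING_BIAS (1U << 10). */
#define GUP_PIN_COUNTING_BIAS (1U << 10)

/*
 * Hypothetical helper (not a kernel function): for a small folio, each
 * FOLL_PIN adds GUP_PIN_COUNTING_BIAS to the refcount, so a refcount
 * below the bias rules out FOLL_PIN. A value at or above the bias only
 * means "maybe pinned", since many plain references could add up to it.
 */
static bool small_folio_maybe_pinned(unsigned int refcount)
{
	return refcount >= GUP_PIN_COUNTING_BIAS;
}
```

With refcount:4 from the dump this returns false, so FOLL_PIN is ruled out, while an ordinary FOLL_GET reference (which adds only 1) remains possible.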
[ 3404.163193] flags: 0x17ffff066c00420c(referenced|uptodate|workingset|private|node=1|zone=3|lastcpupid=0x1ffff)
Neither dirty nor under writeback.
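A small sketch of how the raw flags word can be checked by hand. The bit positions below are an assumption: they were chosen because they are consistent with the names printed in this particular dump, but page flag numbering varies between kernel versions and configs, so verify against the page-flags header of the kernel in question. The enum and helper names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit positions, consistent with the decoded names in the dump
 * above (referenced|uptodate|workingset|private). Not authoritative.
 */
enum assumed_page_flag_bit {
	ASSUMED_PG_WRITEBACK  = 1,
	ASSUMED_PG_REFERENCED = 2,
	ASSUMED_PG_UPTODATE   = 3,
	ASSUMED_PG_DIRTY      = 4,
	ASSUMED_PG_WORKINGSET = 9,
	ASSUMED_PG_PRIVATE    = 14,
};

/* Test a single bit in the raw flags word taken from the dump. */
static bool page_flag_set(uint64_t flags, enum assumed_page_flag_bit bit)
{
	return (flags >> bit) & 1;
}
```

Applied to 0x17ffff066c00420c, the dirty and writeback bits come out clear under these assumed positions, which matches the observation above.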
--
Cheers
David