Re: [Lsf-pc] [LSF/MM/BPF TOPIC][RFC PATCH v4 00/27] Private Memory Nodes (w/ Compressed RAM)
From: Gregory Price
Date: Thu Jun 18 2026 - 07:20:17 EST
On Thu, Jun 18, 2026 at 10:21:30AM +0200, Vlastimil Babka (SUSE) wrote:
> On 6/15/26 17:37, Gregory Price wrote:
> >
> > One thought would be a way to switch what fallback list is used, and
> > then have specific fallback lists for certain contexts.
> >
> > Right now there is a single example of this: __GFP_THISNODE
> > |= __GFP_THISNODE => NOFALLBACK
> > &= ~__GFP_THISNODE => FALLBACK
> >
> > We could add an interface with the desired fallback list based as an
> > argument, and let get_page_from_freelist to prefer that over the default
> > global lists.
>
> Does it mean a new argument in a number of functions in the page allocator,
> or can it be mapped to alloc_flags (at least internally?), because the
> number of possible fallback lists is small enough?
>
What I ended up with was adding a single external interface to page_alloc.c
that lets you select the zonelist via an enum, plus a selector stored in
alloc_context that prepare_alloc_pages() resolves internally,
e.g.:
static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
		int preferred_nid, nodemask_t *nodemask,
		struct alloc_context *ac, gfp_t *alloc_gfp,
		unsigned int *alloc_flags)
{
	ac->highest_zoneidx = gfp_zone(gfp_mask);
	ac->zonelist = select_zonelist(preferred_nid, gfp_mask, ac->zlsel);
	... snip ...
}

struct folio *__folio_alloc_zonelist_noprof(gfp_t gfp, unsigned int order,
		int preferred_nid, nodemask_t *nodemask,
		enum alloc_zonelist zlsel);
The original __folio_alloc* functions just pass ALLOC_ZONELIST_DEFAULT, which
tells select_zonelist() to base the decision on __GFP_THISNODE.
struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order, int preferred_nid,
		nodemask_t *nodemask)
{
	return __folio_alloc_core(gfp, order, preferred_nid, nodemask,
				  ALLOC_ZONELIST_DEFAULT);
}
EXPORT_SYMBOL(__folio_alloc_noprof);
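For illustration only, here is a user-space model of how the resolution in select_zonelist() might look. The kernel already indexes per-node zonelists with ZONELIST_FALLBACK and ZONELIST_NOFALLBACK; the third, private index and every MODEL_*/model_* name below are assumptions made for this sketch, as is the choice to let an explicit PRIVATE selector take precedence over __GFP_THISNODE.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel definitions; values are illustrative. */
#define MODEL_GFP_THISNODE	0x1u
#define MODEL_MAX_NODES		4

enum alloc_zonelist {
	ALLOC_ZONELIST_DEFAULT,	/* decide from __GFP_THISNODE, as today */
	ALLOC_ZONELIST_PRIVATE,	/* explicit opt-in to private nodes */
};

/* Per-node list index; the PRIVATE slot is an assumption of this sketch. */
enum model_zonelist_idx {
	MODEL_ZL_FALLBACK,
	MODEL_ZL_NOFALLBACK,
	MODEL_ZL_PRIVATE,
	MODEL_ZL_COUNT,
};

struct model_zonelist {
	int placeholder;	/* the real struct holds zonerefs */
};

struct model_zonelist model_node_zonelists[MODEL_MAX_NODES][MODEL_ZL_COUNT];

struct model_zonelist *select_zonelist(int preferred_nid, unsigned int gfp_mask,
				       enum alloc_zonelist zlsel)
{
	enum model_zonelist_idx idx;

	assert(preferred_nid >= 0 && preferred_nid < MODEL_MAX_NODES);

	if (zlsel == ALLOC_ZONELIST_PRIVATE)
		idx = MODEL_ZL_PRIVATE;		/* only reachable on request */
	else if (gfp_mask & MODEL_GFP_THISNODE)
		idx = MODEL_ZL_NOFALLBACK;	/* |= __GFP_THISNODE */
	else
		idx = MODEL_ZL_FALLBACK;	/* &= ~__GFP_THISNODE */

	return &model_node_zonelists[preferred_nid][idx];
}
```

With ALLOC_ZONELIST_DEFAULT the result depends only on the THISNODE bit, so existing callers see no change in this model.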
This does a few things:
- The isolation is structural: there is no way to accidentally
allocate private memory without passing ALLOC_ZONELIST_PRIVATE.
- The isolation forces folios: there are no non-folio interfaces
which allow zonelist selection.
- The zonelist selection is confined to this allocation context,
so no inheritance is possible.
I tried to avoid using an ALLOC_ flag so we can avoid yet another flag
crunch, but there certainly are few enough zonelists that we could
encode it there and expose it. I know Brendan was looking at plumbing
alloc flags out to an interface, so I'm open to that.
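As a rough sketch of that alternative, the selector could be packed into a small bit field of alloc_flags. The shift, the field width, and the helper names below are hypothetical; the real ALLOC_* bits live in mm/internal.h and a free range would have to be picked there.

```c
#include <assert.h>

enum alloc_zonelist {
	ALLOC_ZONELIST_DEFAULT,
	ALLOC_ZONELIST_PRIVATE,
};

/* Hypothetical placement: a 2-bit field, leaving room for two more lists. */
#define MODEL_ALLOC_ZLSEL_SHIFT	16
#define MODEL_ALLOC_ZLSEL_BITS	2
#define MODEL_ALLOC_ZLSEL_MASK \
	(((1u << MODEL_ALLOC_ZLSEL_BITS) - 1) << MODEL_ALLOC_ZLSEL_SHIFT)

/* Store the selector in alloc_flags without disturbing the other bits. */
unsigned int alloc_flags_set_zlsel(unsigned int alloc_flags,
				   enum alloc_zonelist zlsel)
{
	assert((unsigned int)zlsel < (1u << MODEL_ALLOC_ZLSEL_BITS));

	alloc_flags &= ~MODEL_ALLOC_ZLSEL_MASK;
	alloc_flags |= (unsigned int)zlsel << MODEL_ALLOC_ZLSEL_SHIFT;
	return alloc_flags;
}

/* Recover the selector; zero bits decode to ALLOC_ZONELIST_DEFAULT. */
enum alloc_zonelist alloc_flags_get_zlsel(unsigned int alloc_flags)
{
	return (enum alloc_zonelist)
		((alloc_flags & MODEL_ALLOC_ZLSEL_MASK) >> MODEL_ALLOC_ZLSEL_SHIFT);
}
```

Because DEFAULT encodes as zero, callers that never touch the field keep today's behaviour in this model.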
Externally, I determine which zonelist to use with a lookup keyed on the
allocation reason, letting the node filter on what it permits. This is
really only needed in a couple of spots:
mm/khugepaged.c: enum alloc_zonelist zlsel = alloc_zonelist_for_node(node, NODE_ALLOC_RECLAIM);
mm/vmscan.c: mtc->zlsel = alloc_zonelist_for_nodemask(mtc->nmask, NODE_ALLOC_TIERING);
mm/migrate.c: .zlsel = alloc_zonelist_for_node(node, NODE_ALLOC_USER_MIGRATE),

static inline enum alloc_zonelist
alloc_zonelist_for_node(int nid, enum node_alloc_reason reason)
{
	bool ok;

	if (!node_state(nid, N_MEMORY_PRIVATE))
		return ALLOC_ZONELIST_DEFAULT;

	switch (reason) {
	case NODE_ALLOC_RECLAIM:
		ok = node_is_reclaimable(nid);
		break;
	case NODE_ALLOC_TIERING:
		ok = node_allows_tiering(nid);
		break;
	case NODE_ALLOC_USER_MIGRATE:
		ok = node_allows_user_migrate(nid);
		break;
	default:
		ok = false;
	}

	return ok ? ALLOC_ZONELIST_PRIVATE : ALLOC_ZONELIST_DEFAULT;
}
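
To make the filtering concrete, here is a small user-space model of the same lookup. The struct model_node table stands in for node_state() and the node_*() predicates; the table, its contents, and the model_* names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

enum alloc_zonelist {
	ALLOC_ZONELIST_DEFAULT,
	ALLOC_ZONELIST_PRIVATE,
};

enum node_alloc_reason {
	NODE_ALLOC_RECLAIM,
	NODE_ALLOC_TIERING,
	NODE_ALLOC_USER_MIGRATE,
};

/* Hypothetical per-node description replacing the kernel predicates. */
struct model_node {
	bool is_private;
	bool reclaimable;
	bool allows_tiering;
	bool allows_user_migrate;
};

#define MODEL_MAX_NODES 3

const struct model_node model_nodes[MODEL_MAX_NODES] = {
	{ .is_private = false },				/* ordinary node */
	{ .is_private = true, .allows_tiering = true },	/* tiering only */
	{ .is_private = true, .reclaimable = true,
	  .allows_user_migrate = true },
};

enum alloc_zonelist model_zonelist_for_node(int nid,
					    enum node_alloc_reason reason)
{
	const struct model_node *n;
	bool ok;

	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	n = &model_nodes[nid];

	/* Non-private nodes never need the private zonelist. */
	if (!n->is_private)
		return ALLOC_ZONELIST_DEFAULT;

	switch (reason) {
	case NODE_ALLOC_RECLAIM:
		ok = n->reclaimable;
		break;
	case NODE_ALLOC_TIERING:
		ok = n->allows_tiering;
		break;
	case NODE_ALLOC_USER_MIGRATE:
		ok = n->allows_user_migrate;
		break;
	default:
		ok = false;
	}

	return ok ? ALLOC_ZONELIST_PRIVATE : ALLOC_ZONELIST_DEFAULT;
}
```

In this model, node 1 yields ALLOC_ZONELIST_PRIVATE for NODE_ALLOC_TIERING but ALLOC_ZONELIST_DEFAULT for NODE_ALLOC_RECLAIM, so a caller whose reason the node does not permit never gets the private selector.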
Otherwise... everything is now a mempolicy w/ MPOL_F_BIND and all the
handling goes through the normal fault-paths :]

static struct page *__alloc_pages_mpol(gfp_t gfp, unsigned int order,
		struct mempolicy *pol, pgoff_t ilx, int nid)
{
	nodemask_t *nodemask;
	struct page *page;
	enum alloc_zonelist zlsel = (pol->flags & MPOL_F_PRIVATE) ?
			ALLOC_ZONELIST_PRIVATE : ALLOC_ZONELIST_DEFAULT;
	...
	if (pol->mode == MPOL_PREFERRED_MANY)
		return alloc_pages_preferred_many(gfp, order, nid, nodemask,
						  zlsel);
	...
}

Switching to an alloc_flag would probably be trivial if that's really
wanted.
~Gregory