Re: [GIT PULL] Memory folios for v5.15
From: Matthew Wilcox
Date: Fri Aug 27 2021 - 08:06:37 EST
On Fri, Aug 27, 2021 at 06:03:25AM -0400, Johannes Weiner wrote:
> At the current stage of conversion, folio is a more clearly delineated
> API of what can be safely used from the FS for the interaction with
> the page cache and memory management. And it looks still flexible to
> make all sorts of changes, including how it's backed by
> memory. Compared with the page, where parts of the API are for the FS,
> but there are tons of members, functions, constants, and restrictions
> due to the page's role inside MM core code. Things you shouldn't be
> using, things you shouldn't be assuming from the fs side, but it's
> hard to tell which is which, because struct page is a lot of things.
>
> However, the MM narrative for folios is that they're an abstraction
> for regular vs compound pages. This is rather generic. Conceptually,
> it applies very broadly and deeply to MM core code: anonymous memory
> handling, reclaim, swapping, even the slab allocator uses them. If we
> follow through on this concept from the MM side - and that seems to be
> the plan - it's inevitable that the folio API will grow more
> MM-internal members, methods, as well as restrictions again in the
> process. Except for the tail page bits, I don't see too much in struct
> page that would not conceptually fit into this version of the folio.
So the superhypermegaultra ambitious version of this does something
like:

struct slab_page {
	unsigned long flags;
	union {
		struct list_head slab_list;
		struct {
			...
		};
	};
	struct kmem_cache *slab_cache;
	void *freelist;
	void *s_mem;
	unsigned int active;
	atomic_t _refcount;
	unsigned long memcg_data;
};

struct folio {
	... more or less as now ...
};

struct net_page {
	unsigned long flags;
	unsigned long pp_magic;
	struct page_pool *pp;
	unsigned long _pp_mapping_pad;
	unsigned long dma_addr[2];
	atomic_t _mapcount;
	atomic_t _refcount;
	unsigned long memcg_data;
};

struct page {
	union {
		struct folio folio;
		struct slab_page slab;
		struct net_page pool;
		...
	};
};

and then functions which only take one specific type of page use that
type. And the compiler will tell you that you can't pass a net_page
to a slab function, or vice versa.
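
To make the type-checking point concrete, here is a minimal userspace sketch.
The structs are cut-down stand-ins rather than the kernel's definitions, and
the helper names (page_slab(), page_net(), slab_active(), net_dma_addr()) are
hypothetical, invented only for this illustration:

```c
#include <stddef.h>

/* Cut-down, hypothetical stand-ins; NOT the kernel's real layouts. */
struct slab_page {
	unsigned long flags;
	void *freelist;
	unsigned int active;
};

struct net_page {
	unsigned long flags;
	unsigned long pp_magic;
	unsigned long dma_addr[2];
};

struct page {
	union {
		struct slab_page slab;
		struct net_page pool;
	};
};

/* Every view overlays the same memory, so flags must line up. */
_Static_assert(offsetof(struct slab_page, flags) ==
	       offsetof(struct net_page, flags),
	       "flags must sit at the same offset in every view");

/* Hypothetical accessors: the only way to get at a typed view. */
static inline struct slab_page *page_slab(struct page *page)
{
	return &page->slab;
}

static inline struct net_page *page_net(struct page *page)
{
	return &page->pool;
}

/* Functions that only make sense for one kind of page take that type. */
static inline unsigned int slab_active(const struct slab_page *slab)
{
	return slab->active;
}

static inline unsigned long net_dma_addr(const struct net_page *np)
{
	return np->dma_addr[0];
}

/*
 * slab_active(page_net(page)) does not build cleanly: a struct net_page *
 * is not a struct slab_page *. GCC and Clang diagnose the mismatch, and it
 * is a hard error with -Werror=incompatible-pointer-types.
 */
```

With plain struct page everywhere, the equivalent mistake compiles silently,
since both functions would take the same pointer type.
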
This is a lot more churn, and I'm far from convinced that it's worth
doing. There's also the tricky "This page is mappable to userspace"
kind of functions, which (for example) includes vmalloc and net_page
as well as folios and random driver allocations, but shouldn't include
slab or page table pages. They're especially tricky because mapping to
userspace comes with rules around the use of the ->mapping field as well
as ->_mapcount.
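
As a rough illustration of why that category cuts across the per-type
structs, the "mappable to userspace" property could be written as a predicate
over a page-kind tag. The enum and function below are hypothetical and exist
only to restate the list above; nothing like them is being proposed:

```c
#include <stdbool.h>

/* Hypothetical tags, one per kind of page mentioned above. */
enum page_kind {
	PAGE_KIND_FOLIO,
	PAGE_KIND_SLAB,
	PAGE_KIND_NET,
	PAGE_KIND_VMALLOC,
	PAGE_KIND_DRIVER,
	PAGE_KIND_PAGE_TABLE,
};

/*
 * True for kinds of page that may be mapped to userspace and therefore
 * have to follow the rules for ->mapping and ->_mapcount.
 */
static inline bool page_kind_user_mappable(enum page_kind kind)
{
	switch (kind) {
	case PAGE_KIND_FOLIO:
	case PAGE_KIND_NET:
	case PAGE_KIND_VMALLOC:
	case PAGE_KIND_DRIVER:
		return true;
	case PAGE_KIND_SLAB:
	case PAGE_KIND_PAGE_TABLE:
		return false;
	}
	return false;
}
```

The mappable set spans several of the otherwise unrelated types, so a single
function argument type cannot express it the way slab_page or net_page can.
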