Re: Folios for 5.15 request - Was: re: Folio discussion recap -

From: Johannes Weiner
Date: Mon Oct 25 2021 - 11:35:31 EST


On Fri, Oct 22, 2021 at 02:52:31AM +0100, Matthew Wilcox wrote:
> > Anyway. I can even be convinced that we can figure out the exact fault
> > lines along which we split the page down the road.
> >
> > My worry is more about 2). A shared type and generic code is likely to
> > emerge regardless of how we split it. Think about it, the only world
> > in which that isn't true would be one in which either
> >
> > a) page subtypes are all the same, or
> > b) the subtypes have nothing in common
> >
> > and both are clearly bogus.
>
> Amen!
>
> I'm convinced that pgtable, slab and zsmalloc uses of struct page can all
> be split out into their own types instead of being folios. They have
> little-to-nothing in common with anon+file; they can't be mapped into
> userspace and they can't be on the LRU. The only situation you can find
> them in is something like compaction which walks PFNs.

They can all be accounted to a cgroup. pgtables are tracked the same
as other __GFP_ACCOUNT pages (pipe buffers and kernel stacks right now
from a quick grep, but as you can guess that's open-ended).

So if none of those are folios, the generic type and the interfacing
object for memcg and accounting would continue to be the page.

> Perhaps you could comment on how you'd see separate anon_mem and
> file_mem types working for the memcg code? Would you want to have
> separate lock_anon_memcg() and lock_file_memcg(), or would you want
> them to be cast to a common type like lock_folio_memcg()?

That should be lock_<generic>_memcg() since it actually serializes and
protects the same thing for all subtypes (unlike lock_page()!).

The memcg interface is fully type agnostic nowadays, but it also needs
to be able to handle any subtype. It should continue to interface with
the broadest, most generic definition of "chunk of memory".
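
To illustrate the point about a single lock_<generic>_memcg(), here is a
toy user-space sketch, not kernel code. Every name in it (toy_memcg,
toy_mem, toy_anon_mem, toy_file_mem, toy_pgtable, toy_lock_mem_memcg and
so on) is hypothetical and made up for illustration; the boolean "lock"
only models the idea of serialization and is not a real lock. The
assumption is that each subtype embeds one generic "chunk of memory"
object, and that the memcg-facing functions only ever take that generic
object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory cgroup; not the kernel's struct mem_cgroup. */
struct toy_memcg {
	bool move_lock_held;	/* models the per-memcg serialization */
	long nr_charged;	/* number of chunks charged to this group */
};

/* Hypothetical generic "chunk of memory" that every subtype embeds. */
struct toy_mem {
	struct toy_memcg *memcg;
};

/* Hypothetical subtypes; each one carries the generic part. */
struct toy_anon_mem { struct toy_mem mem; void *anon_vma; };
struct toy_file_mem { struct toy_mem mem; void *mapping; };
struct toy_pgtable  { struct toy_mem mem; };

/* Charging is subtype agnostic: it only sees the generic object. */
static void toy_charge(struct toy_mem *mem, struct toy_memcg *memcg)
{
	mem->memcg = memcg;
	memcg->nr_charged++;
}

/*
 * One lock function for all subtypes. It protects the chunk's memcg
 * association, which is the same thing no matter which subtype the
 * chunk belongs to, so there is no per-subtype variant.
 */
static void toy_lock_mem_memcg(struct toy_mem *mem)
{
	assert(!mem->memcg->move_lock_held);
	mem->memcg->move_lock_held = true;
}

static void toy_unlock_mem_memcg(struct toy_mem *mem)
{
	assert(mem->memcg->move_lock_held);
	mem->memcg->move_lock_held = false;
}
```

A caller holding a toy_anon_mem or a toy_pgtable would pass &obj.mem to
the same functions; nothing in the memcg-facing code needs to know which
subtype it was handed.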

Notably it does not do tailpages (and I don't see how it ever would),
so it could in theory use the folio - but only if the folio is really
the systematic replacement of absolutely *everything* that isn't a
tailpage - including pgtables, kernel stack, pipe buffers, and all
other random alloc_page() calls spread throughout the code base. Not
just conceptually, but an actual wholesale replacement of struct page
throughout allocation sites.

I'm not sure that's realistic. So I'm thinking struct page will likely
be the interfacing object for memcg for the foreseeable future.