Re: Folios for 5.15 request - Was: re: Folio discussion recap -
From: Matthew Wilcox
Date: Mon Oct 25 2021 - 11:55:50 EST
On Mon, Oct 25, 2021 at 11:35:25AM -0400, Johannes Weiner wrote:
> On Fri, Oct 22, 2021 at 02:52:31AM +0100, Matthew Wilcox wrote:
> > > Anyway. I can even be convinced that we can figure out the exact fault
> > > lines along which we split the page down the road.
> > >
> > > My worry is more about 2). A shared type and generic code is likely to
> > > emerge regardless of how we split it. Think about it, the only world
> > > in which that isn't true would be one in which either
> > >
> > > a) page subtypes are all the same, or
> > > b) the subtypes have nothing in common
> > >
> > > and both are clearly bogus.
> >
> > Amen!
> >
> > I'm convinced that pgtable, slab and zsmalloc uses of struct page can all
> > be split out into their own types instead of being folios. They have
> > little-to-nothing in common with anon+file; they can't be mapped into
> > userspace and they can't be on the LRU. The only situation you can find
> > them in is something like compaction which walks PFNs.
>
> They can all be accounted to a cgroup. pgtables are tracked the same
> as other __GFP_ACCOUNT pages (pipe buffers and kernel stacks right now
> from a quick grep, but as you can guess that's open-ended).

Oh, this is good information!

> So if those all aren't folios, the generic type and the interfacing
> object for memcg and accounting would continue to be the page.
>
> > Perhaps you could comment on how you'd see separate anon_mem and
> > file_mem types working for the memcg code? Would you want to have
> > separate lock_anon_memcg() and lock_file_memcg(), or would you want
> > them to be cast to a common type like lock_folio_memcg()?
>
> That should be lock_<generic>_memcg() since it actually serializes and
> protects the same thing for all subtypes (unlike lock_page()!).
>
> The memcg interface is fully type agnostic nowadays, but it also needs
> to be able to handle any subtype. It should continue to interface with
> the broadest, most generic definition of "chunk of memory".

Some of the memory descriptors might prefer to keep their memcg_data at a
different offset from the start of the struct. Can we accommodate that,
or do we ever get handed a specialised memory descriptor, then have to
cast back to an unspecialised descriptor?

(The LRU list would be an example of this; the list_head must be at the
same offset in all memory descriptors which use the LRU list.)
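
To make the offset question concrete, here is a minimal userspace sketch.
It is not kernel code: struct mem_desc, struct file_mem, struct slab_like
and both accessors are hypothetical stand-ins invented for illustration,
and the field layouts are assumptions. It shows that generic code can only
read memcg_data through an unspecialised pointer if every specialised
descriptor keeps the field at the same offset; a descriptor that moves the
field needs a type-specific accessor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the kernel's real definitions. */
struct list_head {
	struct list_head *next, *prev;
};

/* Unspecialised descriptor: the layout that generic code relies on. */
struct mem_desc {
	unsigned long flags;
	struct list_head lru;
	unsigned long memcg_data;
};

/* Specialised descriptor that keeps the shared fields where they were. */
struct file_mem {
	unsigned long flags;
	struct list_head lru;
	unsigned long memcg_data;
	void *mapping;
};

/* Specialised descriptor with no LRU, so memcg_data lands elsewhere. */
struct slab_like {
	unsigned long flags;
	void *freelist;
	unsigned long memcg_data;
};

/* Compile-time check that file_mem is layout-compatible for generic code. */
_Static_assert(offsetof(struct file_mem, lru) ==
	       offsetof(struct mem_desc, lru),
	       "lru must sit at the generic offset");
_Static_assert(offsetof(struct file_mem, memcg_data) ==
	       offsetof(struct mem_desc, memcg_data),
	       "memcg_data must sit at the generic offset");

/*
 * Generic accessor: knows only the unspecialised layout. Correct only for
 * descriptors that pass the static asserts above.
 */
static unsigned long generic_memcg_data(const void *desc)
{
	unsigned long val;

	memcpy(&val, (const char *)desc + offsetof(struct mem_desc, memcg_data),
	       sizeof(val));
	return val;
}

/* Type-specific accessor: needed once a descriptor moves the field. */
static unsigned long slab_like_memcg_data(const struct slab_like *s)
{
	return s->memcg_data;
}
```

With this layout, a struct file_mem can be handed to generic_memcg_data()
safely, while a struct slab_like cannot; the caller would have to know the
specialised type and use slab_like_memcg_data() instead.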