Re: [RFC v3 2/4] mm: move PG_slab flag to page_type
From: Matthew Wilcox
Date: Mon Jan 30 2023 - 00:12:40 EST
On Mon, Jan 30, 2023 at 01:34:59PM +0900, Hyeonggon Yoo wrote:
> > Seems like quite some changes to page_type to accommodate SLAB, which is
> > hopefully going away soon(TM). Could we perhaps avoid that?
>
> If it could be done with fewer changes, I'll try to avoid that.
Let me outline the idea I had for removing PG_slab:
Observe that PG_reserved and PG_slab are mutually exclusive. Also,
if PG_reserved is set, no other flags are set. If PG_slab is set, only
PG_locked is used. Many of the flags are only for use by anon/page
cache pages (eg referenced, uptodate, dirty, lru, active, workingset,
waiters, error, owner_priv_1, writeback, mappedtodisk, reclaim,
swapbacked, unevictable, mlocked).
Redefine PG_reserved as PG_kernel. Now we can use the other _15_
flags to indicate pagetype, as long as PG_kernel is set. So, eg
PageSlab() can now be (page->flags & PG_type) == PG_slab where
#define PG_kernel 0x00001
#define PG_type (PG_kernel | 0x7fff0)
#define PG_slab (PG_kernel | 0x00010)
#define PG_reserved (PG_kernel | 0x00020)
#define PG_buddy (PG_kernel | 0x00030)
#define PG_offline (PG_kernel | 0x00040)
#define PG_table (PG_kernel | 0x00050)
#define PG_guard (PG_kernel | 0x00060)
That frees up the existing PG_slab, lets us drop the page_type field
altogether and gives us space to define all the page types we might
want (eg PG_vmalloc).
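To make the encoding concrete, here is a minimal userspace sketch of the type check, not kernel code. The PG_* values are the ones listed above; struct fake_page, page_is_type() and FAKE_PG_locked are hypothetical names for illustration, and the bit used for the lock flag is an assumption, since this layout does not say where PG_locked lives.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the proposal above. */
#define PG_kernel	0x00001UL
#define PG_type		(PG_kernel | 0x7fff0UL)
#define PG_slab		(PG_kernel | 0x00010UL)
#define PG_reserved	(PG_kernel | 0x00020UL)
#define PG_buddy	(PG_kernel | 0x00030UL)
#define PG_offline	(PG_kernel | 0x00040UL)
#define PG_table	(PG_kernel | 0x00050UL)
#define PG_guard	(PG_kernel | 0x00060UL)

/* Assumption: some bit outside PG_type that a slab page may still use. */
#define FAKE_PG_locked	0x00002UL

/* Hypothetical stand-in for struct page; only the flags word matters here. */
struct fake_page {
	unsigned long flags;
};

/*
 * The type is an enumerated value inside the PG_type mask, not a single
 * bit, so the test is an equality compare after masking. PG_buddy (0x31)
 * contains the bits of both PG_slab and PG_reserved, so a plain
 * "flags & PG_slab" test would give false positives.
 */
static bool page_is_type(const struct fake_page *page, unsigned long type)
{
	return (page->flags & PG_type) == type;
}
```

With this, a page whose flags are PG_slab | FAKE_PG_locked still tests as slab, while a page with only bit 0x10 set and PG_kernel clear does not.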
We'll want to reorganise all the flags which are for anon/file pages
into a contiguous block. And now that I think about it, vmalloc pages
can be mapped to userspace, so they can get marked dirty, so only
14 bits are available. Maybe rearrange to ...
PG_locked 0x000001
PG_writeback 0x000002
PG_head 0x000004
PG_dirty 0x000008
PG_owner_priv_1 0x000010
PG_arch_1 0x000020
PG_private 0x000040
PG_waiters 0x000080
PG_kernel 0x000100
PG_referenced 0x000200
PG_uptodate 0x000400
PG_lru 0x000800
PG_active 0x001000
PG_workingset 0x002000
PG_error 0x004000
PG_private_2 0x008000
PG_mappedtodisk 0x010000
PG_reclaim 0x020000
PG_swapbacked 0x040000
PG_unevictable 0x080000
PG_mlocked 0x100000
... or something. There are a number of constraints and it may take
a few iterations to get this right. Oh, and if this is the layout
we use, then:
PG_type 0x1fff00
PG_reserved (PG_kernel | 0x200)
PG_slab (PG_kernel | 0x400)
PG_buddy (PG_kernel | 0x600)
PG_offline (PG_kernel | 0x800)
PG_table (PG_kernel | 0xa00)
PG_guard (PG_kernel | 0xc00)
PG_vmalloc (PG_kernel | 0xe00)
This is going to make show_page_flags() more complex :-P
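A rough userspace sketch of the extra decode step, using the second layout above. kernel_page_type_name() is a hypothetical helper, not an existing kernel function; the real show_page_flags() machinery works differently, and this only illustrates that the bits under PG_type have to be read as one value when PG_kernel is set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values copied from the rearranged layout above. */
#define PG_dirty	0x000008UL
#define PG_kernel	0x000100UL
#define PG_type		0x1fff00UL
#define PG_reserved	(PG_kernel | 0x200UL)
#define PG_slab		(PG_kernel | 0x400UL)
#define PG_buddy	(PG_kernel | 0x600UL)
#define PG_offline	(PG_kernel | 0x800UL)
#define PG_table	(PG_kernel | 0xa00UL)
#define PG_guard	(PG_kernel | 0xc00UL)
#define PG_vmalloc	(PG_kernel | 0xe00UL)

/*
 * Hypothetical helper: returns a type name for kernel pages, or NULL
 * when PG_kernel is clear and the bits are ordinary independent flags
 * (PG_referenced, PG_uptodate, ...) that must be printed one by one.
 */
static const char *kernel_page_type_name(unsigned long flags)
{
	if (!(flags & PG_kernel))
		return NULL;

	switch (flags & PG_type) {
	case PG_reserved:	return "reserved";
	case PG_slab:		return "slab";
	case PG_buddy:		return "buddy";
	case PG_offline:	return "offline";
	case PG_table:		return "table";
	case PG_guard:		return "guard";
	case PG_vmalloc:	return "vmalloc";
	default:		return "unknown";
	}
}
```

Because PG_dirty sits below PG_kernel in this layout, PG_vmalloc | PG_dirty still decodes as "vmalloc", while 0x200 without PG_kernel is just PG_referenced and yields NULL.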
Oh, and while we're doing this, we should just make PG_mlocked
unconditional. NOMMU doesn't need the extra space in page flags
(for what? their large number of NUMA nodes?)