Re: [PATCH 00/11] PageSlab: eliminate unnecessary compound_head() calls

From: Matthew Wilcox
Date: Tue Oct 12 2021 - 15:25:30 EST


On Tue, Oct 12, 2021 at 02:01:37PM -0400, Johannes Weiner wrote:
> PageSlab() currently imposes a compound_head() call on all callsites
> even though only very few situations encounter tailpages. This short
> series bubbles tailpage resolution up to the few sites that need it,
> and eliminates it everywhere else.
>
> This is a stand-alone improvement. However, it's inspired by Willy's
> patches to split struct slab from struct page. Those patches currently
> resolve a slab object pointer to its struct slab as follows:
>
> slab = virt_to_slab(p); /* tailpage resolution */
> if (slab_test_cache(slab)) { /* slab or page alloc bypass? */
> do_slab_stuff(slab);
> } else {
> page = (struct page *)slab;
> do_page_stuff(page);
> }
>
> which makes struct slab an ambiguous type that needs to self-identify
> with the slab_test_cache() test (which in turn relies on PG_slab in
> the flags field shared between page and slab).
>
> It would be preferable to do:
>
> page = virt_to_head_page(p); /* tailpage resolution */
> if (PageSlab(page)) { /* slab or page alloc bypass? */
> slab = page_slab(page);
> do_slab_stuff(slab);
> } else {
> do_page_stuff(page);
> }
>
> and leave the ambiguity and the need to self-identify with struct
> page, so that struct slab is a strong and unambiguous type, and a
> non-NULL struct slab encountered in the wild is always a valid object
> without the need to check another dedicated flag for validity first.
>
> However, because PageSlab() currently implies tailpage resolution,
> writing the virt->slab lookup in the preferred way would add yet more
> unnecessary compound_head() call to the hottest MM paths.
>
> The page flag helpers should eventually all be weaned off of those
> compound_head() calls for their unnecessary overhead alone. But this
> one in particular is now getting in the way of writing code in the
> preferred manner, and bleeding page ambiguity into the new types that
> are supposed to eliminate specifically that. It's ripe for a cleanup.

So what I had in mind was more the patch at the end (which I now realise
is missing the corresponding changes to __ClearPageSlab()). There is,
however, some weirdness with kfence, which appears to be abusing PageSlab
by setting it on pages which are not slab pages???

page = virt_to_page(p);
if (PageSlab(page)) { /* slab or page alloc bypass? */
slab = page_slab(page); /* tail page resolution */
do_slab_stuff(slab);
} else {
do_page_stuff(page); /* or possibly compound_head(page) */
}

There could also be a PageTail check in there for some of the cases --
catch people doing something like:
kfree(kmalloc(65536, GFP_KERNEL) + 16384);
which happens to work today, but should probably not.

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 8951da34aa51..b4b62fa100eb 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -344,7 +344,7 @@ PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
TESTCLEARFLAG(Active, active, PF_HEAD)
PAGEFLAG(Workingset, workingset, PF_HEAD)
TESTCLEARFLAG(Workingset, workingset, PF_HEAD)
-__PAGEFLAG(Slab, slab, PF_NO_TAIL)
+__PAGEFLAG(Slab, slab, PF_ANY)
__PAGEFLAG(SlobFree, slob_free, PF_NO_TAIL)
PAGEFLAG(Checked, checked, PF_NO_COMPOUND) /* Used by some filesystems */

diff --git a/mm/slab.c b/mm/slab.c
index d0f725637663..c8c6e8576936 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1371,6 +1371,7 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
int nodeid)
{
struct page *page;
+ int i;

flags |= cachep->allocflags;

@@ -1381,7 +1382,8 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
}

account_slab_page(page, cachep->gfporder, cachep, flags);
- __SetPageSlab(page);
+ for (i = 0; i < compound_nr(page); i++)
+ __SetPageSlab(page + i);
/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
if (sk_memalloc_socks() && page_is_pfmemalloc(page))
SetPageSlabPfmemalloc(page);
diff --git a/mm/slub.c b/mm/slub.c
index f7ac28646580..e442cebda4ef 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1916,7 +1916,8 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
account_slab_page(page, oo_order(oo), s, flags);

page->slab_cache = s;
- __SetPageSlab(page);
+ for (idx = 0; idx < compound_nr(page); idx++)
+ __SetPageSlab(page + idx);
if (page_is_pfmemalloc(page))
SetPageSlabPfmemalloc(page);