Re: [PATCH v10 00/33] Memory folios

From: Matteo Croce
Date: Tue Jun 08 2021 - 10:57:03 EST


On Fri, Jun 4, 2021 at 4:13 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Fri, Jun 04, 2021 at 03:07:12AM +0200, Matteo Croce wrote:
> > On Tue, 11 May 2021 22:47:02 +0100
> > "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx> wrote:
> >
> > > We also waste a lot of instructions ensuring that we're not looking at
> > > a tail page. Almost every call to PageFoo() contains one or more
> > > hidden calls to compound_head(). This also happens for get_page(),
> > > put_page() and many more functions. There does not appear to be a
> > > way to tell gcc that it can cache the result of compound_head(), nor
> > > is there a way to tell it that compound_head() is idempotent.
> > >
> >
> > Maybe it's not effective in all situations but the following hint to
> > the compiler seems to have an effect, at least according to bloat-o-meter:
>
> It definitely has an effect ;-)
>
> Note that a function that has pointer arguments and examines the
> data pointed to must _not_ be declared 'const' if the pointed-to
> data might change between successive invocations of the function.
> In general, since a function cannot distinguish data that might
> change from data that cannot, const functions should never take
> pointer or, in C++, reference arguments. Likewise, a function that
> calls a non-const function usually must not be const itself.
>
> So that's not going to work because a call to split_huge_page() won't
> tell the compiler that it's changed.
>
> Reading the documentation, we might be able to get away with marking the
> function as pure:
>
> The 'pure' attribute imposes similar but looser restrictions on a
> function's definition than the 'const' attribute: 'pure' allows the
> function to read any non-volatile memory, even if it changes in
> between successive invocations of the function.
>
> although that's going to miss opportunities, since taking a lock will
> modify the contents of struct page, meaning the compiler won't cache
> the results of compound_head().
>
> > $ scripts/bloat-o-meter vmlinux.o.orig vmlinux.o
> > add/remove: 3/13 grow/shrink: 65/689 up/down: 21080/-198089 (-177009)
>
> I assume this is an allyesconfig kernel? I think it's a good
> indication of how much opportunity there is.
>

Yes, it's an allyesconfig kernel.
I repeated the comparison with compound_head() marked pure instead of const:

$ git diff
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 04a34c08e0a6..548b72b46eb1 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -179,7 +179,7 @@ enum pageflags {

 struct page;	/* forward declaration */

-static inline struct page *compound_head(struct page *page)
+static inline __pure struct page *compound_head(struct page *page)
 {
 	unsigned long head = READ_ONCE(page->compound_head);


$ scripts/bloat-o-meter vmlinux.o.orig vmlinux.o
add/remove: 3/13 grow/shrink: 63/689 up/down: 20910/-192081 (-171171)
Function                                     old     new   delta
ntfs_mft_record_alloc                      14414   16627   +2213
migrate_pages                               8891   10819   +1928
ext2_get_page.isra                          1029    2343   +1314
kfence_init                                  180    1331   +1151
page_remove_rmap                             754    1893   +1139
f2fs_fsync_node_pages                       4378    5406   +1028
[...]
migrate_page_states                         7088    4842   -2246
ntfs_mft_record_format                      2940       -   -2940
lru_deactivate_file_fn                      9220    6277   -2943
shrink_page_list                           20653   15749   -4904
page_memcg                                  5149     193   -4956
Total: Before=388869713, After=388698542, chg -0.04%

$ ls -l vmlinux.o.orig vmlinux.o
-rw-rw-r-- 1 mcroce mcroce 1295502680 Jun 8 16:47 vmlinux.o
-rw-rw-r-- 1 mcroce mcroce 1295934624 Jun 8 16:28 vmlinux.o.orig

vmlinux.o is ~420 KiB smaller (431944 bytes).
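
For anyone who wants to play with the semantics outside the kernel, here is
a minimal userspace sketch of what pure allows. It is not the kernel code:
struct fake_page, fake_compound_head(), same_head_twice() and fake_split()
are made-up names, and the load is a plain one rather than READ_ONCE().
With no store between two calls the compiler is allowed to merge them, but
a store to the page (standing in for something like split_huge_page())
forces a fresh read, which is the property const would not give us.

```c
#include <assert.h>

/* Hypothetical stand-in for struct page: only the field that matters here. */
struct fake_page {
	unsigned long compound_head;	/* bit 0 set: tail page, rest is the head pointer */
};

/*
 * Simplified model of compound_head(). Assumption: a plain load is used
 * instead of the kernel's READ_ONCE(), so this only illustrates the attribute.
 */
__attribute__((pure))
static struct fake_page *fake_compound_head(struct fake_page *page)
{
	unsigned long head = page->compound_head;

	if (head & 1)
		return (struct fake_page *)(head - 1);
	return page;
}

/* No store between the two calls, so the compiler may evaluate only one. */
static int same_head_twice(struct fake_page *page)
{
	return fake_compound_head(page) == fake_compound_head(page);
}

/* Hypothetical stand-in for a split: the tail page becomes its own head. */
static void fake_split(struct fake_page *tail)
{
	tail->compound_head = 0;
}

/* Walks through the cases; call it from main() to try the sketch. */
static void demo(void)
{
	struct fake_page pages[2] = { { 0 }, { 0 } };

	pages[1].compound_head = (unsigned long)&pages[0] | 1;
	assert(fake_compound_head(&pages[1]) == &pages[0]);
	assert(same_head_twice(&pages[1]));

	/* The store below must be observed by the next call. */
	fake_split(&pages[1]);
	assert(fake_compound_head(&pages[1]) == &pages[1]);
}
```

Whether the two calls in same_head_twice() really get merged depends on
the compiler and optimization level; the attribute only gives permission.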

--
per aspera ad upstream