Re: [RFC] mm: page-flags.h: remove the bias against tail pages

From: John Hubbard
Date: Mon Mar 25 2024 - 23:22:01 EST


On 3/24/24 10:24 PM, Matthew Wilcox wrote:
..
It's complicated. On the one hand, it's "more likely" because there are
more tail pages than there are head pages or order-0 pages. On the
other hand, a _lot_ of the time we call compound_head(), it's done with
a non-tail page because we tend to pass around head pages (eg,

ah yes, that's true.

pmd_page() on hugetlbfs, or looking up a folio in the page cache and
passing &folio->page to some function that's not yet converted.

On the third hand, does the compiler really do much with the annotation?

Before your patch:

27d6: a8 01 test $0x1,%al
27d8: 75 02 jne 27dc <clear_refs_pte_range+0x9c>

I should have thought to check this. Usually I'll see a change between je/jne
if __builtin_expect is doing its job. Here it is, oddly, missing in action.

Maybe I'll look a little closer into why that is...

27da: eb 59 jmp 2835 <clear_refs_pte_range+0xf5>
27dc: 49 8b 44 24 08 mov 0x8(%r12),%rax
27e1: a8 01 test $0x1,%al
27e3: 75 6f jne 2854 <clear_refs_pte_range+0x114>
27e5: eb 73 jmp 285a <clear_refs_pte_range+0x11a>

With your patch:

1ee6: a8 01 test $0x1,%al
1ee8: 75 02 jne 1eec <clear_refs_pte_range+0x9c>
1eea: eb 5f jmp 1f4b <clear_refs_pte_range+0xfb>
1eec: 49 8b 44 24 08 mov 0x8(%r12),%rax
1ef1: a8 01 test $0x1,%al
1ef3: 75 50 jne 1f45 <clear_refs_pte_range+0xf5>
1ef5: eb 6c jmp 1f63 <clear_refs_pte_range+0x113>

Looks pretty much the same. bloat-o-meter says:

$ ./scripts/bloat-o-meter before.o after.o
add/remove: 0/0 grow/shrink: 2/4 up/down: 32/-48 (-16)
Function old new delta
gather_stats.constprop 730 753 +23
smaps_hugetlb_range 635 644 +9
smaps_page_accumulate 342 338 -4
clear_refs_pte_range 339 328 -11
pagemap_hugetlb_range 422 407 -15
smaps_pte_range 1406 1388 -18
Total: Before=20066, After=20050, chg -0.08%

(I was looking at clear_refs_pte_range above). This seems marginal.
The benefits of removing a call to compound_head are much less
ambiguous:

$ ./scripts/bloat-o-meter before.o .build/fs/proc/task_mmu.o
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-101 (-101)
Function old new delta
clear_refs_pte_range 339 238 -101
Total: Before=20066, After=19965, chg -0.50%

I'd describe that as replacing four calls to compound_head() with two:

- page = pmd_page(*pmd);
+ folio = page_folio(pmd_page(*pmd));

/* Clear accessed and referenced bits. */
pmdp_test_and_clear_young(vma, addr, pmd);
- test_and_clear_page_young(page);
- ClearPageReferenced(page);
+ folio_test_clear_young(folio);
+ folio_clear_referenced(folio);
...
- page = vm_normal_page(vma, addr, ptent);
- if (!page)
+ folio = vm_normal_folio(vma, addr, ptent);
+ if (!folio)
continue;

/* Clear accessed and referenced bits. */
ptep_test_and_clear_young(vma, addr, pte);
- test_and_clear_page_young(page);
- ClearPageReferenced(page);
+ folio_test_clear_young(folio);
+ folio_clear_referenced(folio);

I'm not saying this patch is necessarily wrong, I just think it's
"not proven".

I appreciate your looking at this and explaining the analysis steps
you used!


thanks,
--
John Hubbard
NVIDIA