Re: [PATCH v4 19/36] mm: create new codetag references during page splitting

From: Suren Baghdasaryan
Date: Tue Feb 27 2024 - 11:41:45 EST


On Tue, Feb 27, 2024 at 2:10 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 2/21/24 20:40, Suren Baghdasaryan wrote:
> > When a high-order page is split into smaller ones, each newly split
> > page should get its codetag. The original codetag is reused for these
> > pages but it's recorded as 0-byte allocation because original codetag
> > already accounts for the original high-order allocated page.
>
> This was v3 but then you refactored (for the better) so the commit log
> could reflect it?

Yes, technically the mechanism didn't change but I should word it
better. Something like this:

When a high-order page is split into smaller ones, each newly split
page should get its codetag. After the split each split page will be
referencing the original codetag. The codetag's "bytes" counter
remains the same because the amount of allocated memory has not
changed, however the "calls" counter gets increased to keep the
counter correct when these individual pages get freed.

>
> > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
>
> I was going to R-b, but now I recalled the trickiness of
> __free_pages() for non-compound pages if it loses the race to a
> speculative reference. Will the codetag handling work fine there?

I think so. Each non-compound page has its individual reference to its
codetag, which is decremented whenever that page is freed. IIUC the
logic in __free_pages(), when it loses the race to a speculative
reference it frees all pages except the first one, and the first one
is freed when the last put_page() happens. If all these pages were
previously split from one page, then each of them has its own
reference pointing to the same codetag. Every time one of these pages
is freed, that codetag's "bytes" and "calls" counters are decremented.
I think the accounting will work correctly irrespective of where these
pages are freed, in __free_pages() or by put_page().

>
> > ---
> > include/linux/pgalloc_tag.h | 30 ++++++++++++++++++++++++++++++
> > mm/huge_memory.c | 2 ++
> > mm/page_alloc.c | 2 ++
> > 3 files changed, 34 insertions(+)
> >
> > diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h
> > index b49ab955300f..9e6ad8e0e4aa 100644
> > --- a/include/linux/pgalloc_tag.h
> > +++ b/include/linux/pgalloc_tag.h
> > @@ -67,11 +67,41 @@ static inline void pgalloc_tag_sub(struct page *page, unsigned int order)
> > }
> > }
> >
> > +static inline void pgalloc_tag_split(struct page *page, unsigned int nr)
> > +{
> > + int i;
> > + struct page_ext *page_ext;
> > + union codetag_ref *ref;
> > + struct alloc_tag *tag;
> > +
> > + if (!mem_alloc_profiling_enabled())
> > + return;
> > +
> > + page_ext = page_ext_get(page);
> > + if (unlikely(!page_ext))
> > + return;
> > +
> > + ref = codetag_ref_from_page_ext(page_ext);
> > + if (!ref->ct)
> > + goto out;
> > +
> > + tag = ct_to_alloc_tag(ref->ct);
> > + page_ext = page_ext_next(page_ext);
> > + for (i = 1; i < nr; i++) {
> > + /* Set new reference to point to the original tag */
> > + alloc_tag_ref_set(codetag_ref_from_page_ext(page_ext), tag);
> > + page_ext = page_ext_next(page_ext);
> > + }
> > +out:
> > + page_ext_put(page_ext);
> > +}
> > +
> > #else /* CONFIG_MEM_ALLOC_PROFILING */
> >
> > static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
> > unsigned int order) {}
> > static inline void pgalloc_tag_sub(struct page *page, unsigned int order) {}
> > +static inline void pgalloc_tag_split(struct page *page, unsigned int nr) {}
> >
> > #endif /* CONFIG_MEM_ALLOC_PROFILING */
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 94c958f7ebb5..86daae671319 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -38,6 +38,7 @@
> > #include <linux/sched/sysctl.h>
> > #include <linux/memory-tiers.h>
> > #include <linux/compat.h>
> > +#include <linux/pgalloc_tag.h>
> >
> > #include <asm/tlb.h>
> > #include <asm/pgalloc.h>
> > @@ -2899,6 +2900,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
> > /* Caller disabled irqs, so they are still disabled here */
> >
> > split_page_owner(head, nr);
> > + pgalloc_tag_split(head, nr);
> >
> > /* See comment in __split_huge_page_tail() */
> > if (PageAnon(head)) {
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 58c0e8b948a4..4bc5b4720fee 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2621,6 +2621,7 @@ void split_page(struct page *page, unsigned int order)
> > for (i = 1; i < (1 << order); i++)
> > set_page_refcounted(page + i);
> > split_page_owner(page, 1 << order);
> > + pgalloc_tag_split(page, 1 << order);
> > split_page_memcg(page, 1 << order);
> > }
> > EXPORT_SYMBOL_GPL(split_page);
> > @@ -4806,6 +4807,7 @@ static void *make_alloc_exact(unsigned long addr, unsigned int order,
> > struct page *last = page + nr;
> >
> > split_page_owner(page, 1 << order);
> > + pgalloc_tag_split(page, 1 << order);
> > split_page_memcg(page, 1 << order);
> > while (page < --last)
> > set_page_refcounted(last);