Re: [PATCH v2] mm/page_alloc: minor clean up for memmap_init_compound()

From: Muchun Song
Date: Sun Jun 12 2022 - 11:46:55 EST


On Sat, Jun 11, 2022 at 10:13:52AM +0800, Miaohe Lin wrote:
> Since commit 5232c63f46fd ("mm: Make compound_pincount always available"),
> compound_pincount_ptr is stored at first tail page now. So we should call
> prep_compound_head() after the first tail page is initialized to take
> advantage of the likelihood of that tail struct page being cached given
> that we will read them right after in prep_compound_head().
>
> Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> Cc: Joao Martins <joao.m.martins@xxxxxxxxxx>
> ---
> v2:
> Don't move prep_compound_head() outside loop per Joao.
> ---
> mm/page_alloc.c | 17 +++++++++++------
> 1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4c7d99ee58b4..048df5d78add 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6771,13 +6771,18 @@ static void __ref memmap_init_compound(struct page *head,
> set_page_count(page, 0);
>
> /*
> - * The first tail page stores compound_mapcount_ptr() and
> - * compound_order() and the second tail page stores
> - * compound_pincount_ptr(). Call prep_compound_head() after
> - * the first and second tail pages have been initialized to
> - * not have the data overwritten.
> + * The first tail page stores compound_mapcount_ptr(),
> + * compound_order() and compound_pincount_ptr(). Call
> + * prep_compound_head() after the first tail page have
> + * been initialized to not have the data overwritten.
> + *
> + * Note the idea to make this right after we initialize
> + * the offending tail pages is trying to take advantage
> + * of the likelihood of those tail struct pages being
> + * cached given that we will read them right after in
> + * prep_compound_head().
> */
> - if (pfn == head_pfn + 2)
> + if (unlikely(pfn == head_pfn + 1))
> prep_compound_head(head, order);

It seems odd to me not to move this out of the loop. I see it is kept
inside because of the caching argument suggested by Joao. But I don't
think this is a hot path, and moving the call out of the loop would be
more intuitive, at least to me. The optimization may be unnecessary
(I could be wrong). Dropping it would also make this consistent with
prep_compound_page(), which does not do a similar optimization.
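
To make the comparison concrete, here is a rough userspace toy model, not
kernel code. struct toy_page and all toy_* helpers are made up for
illustration and do not reflect the real struct page layout. It only
assumes what the patch comment states: the first tail page holds the
order, mapcount and pincount, and initializing a tail page wipes it. Under
those assumptions, calling the head-prep step inside the loop (right after
the first tail) or once after the loop should leave the same final state;
the only difference would be cache locality, which this model cannot show.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for struct page; hypothetical, NOT the kernel layout. */
struct toy_page {
	bool is_head;
	bool initialized;
	unsigned int order;	/* kept in the first tail page */
	int mapcount;		/* kept in the first tail page */
	int pincount;		/* kept in the first tail page */
};

/* Initializing a tail page wipes whatever was stored in it before. */
static void toy_init_tail(struct toy_page *pages, unsigned long i)
{
	memset(&pages[i], 0, sizeof(pages[i]));
	pages[i].initialized = true;
}

/* Writes head metadata into the first tail page, so it must run after it is initialized. */
static void toy_prep_compound_head(struct toy_page *pages, unsigned int order)
{
	assert(order >= 1);	/* a compound page needs at least one tail */
	pages[0].is_head = true;
	pages[1].order = order;
	pages[1].mapcount = -1;
	pages[1].pincount = 0;
}

/* Variant as in the patch: prep right after the first tail is initialized. */
static void toy_init_in_loop(struct toy_page *pages, unsigned int order)
{
	unsigned long nr = 1UL << order;

	for (unsigned long i = 1; i < nr; i++) {
		toy_init_tail(pages, i);
		if (i == 1)
			toy_prep_compound_head(pages, order);
	}
}

/* Variant suggested in this reply: prep once, after all tails are initialized. */
static void toy_init_after_loop(struct toy_page *pages, unsigned int order)
{
	unsigned long nr = 1UL << order;

	for (unsigned long i = 1; i < nr; i++)
		toy_init_tail(pages, i);
	toy_prep_compound_head(pages, order);
}

/* Field-by-field comparison of two toy page arrays. */
static bool toy_pages_equal(const struct toy_page *a, const struct toy_page *b,
			    unsigned long nr)
{
	for (unsigned long i = 0; i < nr; i++) {
		if (a[i].is_head != b[i].is_head ||
		    a[i].initialized != b[i].initialized ||
		    a[i].order != b[i].order ||
		    a[i].mapcount != b[i].mapcount ||
		    a[i].pincount != b[i].pincount)
			return false;
	}
	return true;
}
```

Running both variants on zeroed arrays of 1 << order entries and comparing
them with toy_pages_equal() is enough to check that the placement does not
change the result in this model.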

Hi Joao,

Is this a significant optimization for zone device memory? This code
has been there since the first version you posted, so I suspect you may
have some numbers. Would you mind sharing them with us?

Thanks.