Re: [RFC V2] mm: add the zero case to page[1].compound_nr in set_compound_order

From: Nico Pache
Date: Wed Dec 14 2022 - 20:07:07 EST


On Tue, Dec 13, 2022 at 11:38 PM Sidhartha Kumar
<sidhartha.kumar@xxxxxxxxxx> wrote:
>
> On 12/13/22 5:02 PM, Mike Kravetz wrote:
> > On 12/13/22 17:27, Nico Pache wrote:
> >> According to the document linked the following approach is even faster
> >> than the one I used due to CPU parallelization:
> >
> > I do not think we are very concerned with speed here. This routine is being
> > called in the creation of compound pages, and in the case of hugetlb the
> > tear down of gigantic pages. In general, creation and tear down of gigantic
> > pages happens infrequently. Usually only at system/application startup and
> > system/application shutdown.
> >
> Hi Nico,
>
> I wrote a bpftrace script to track the time spent in
> __prep_compound_gigantic_folio both with and without the branch in
> folio_set_order() and the resulting histogram was the same for both
> versions. This is probably because the for loop through every base page
> has a much higher overhead than the singular call to folio_set_order().
> I am not sure what the performance difference for THP would be.

Hi Sidhartha,

Ok great! We may still want to proactively implement a branchless version
so that if/when THP starts using this path we won't see a regression.

Furthermore, Waiman brought up a good point off the list:
the bit math is needlessly complex; the same result can be achieved with
page[1].compound_nr = (1U << order) & ~1U;

Tested:
order 0 output : 0
order 1 output : 2
order 2 output : 4
order 3 output : 8
order 4 output : 16
order 5 output : 32
order 6 output : 64
order 7 output : 128
order 8 output : 256
order 9 output : 512
order 10 output : 1024
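
For anyone who wants to reproduce the table above outside the kernel, here
is a minimal userspace sketch. It is not the kernel code: the function
names are hypothetical and exist only for illustration. The idea is that
(1U << 0) is 1, so clearing bit 0 gives 0 for order 0, while for any
order >= 1 bit 0 is already clear and the value is left unchanged. The
sketch assumes order < 32 so that the shift is well defined on a 32-bit
unsigned int.

```c
#include <assert.h>

/*
 * Hypothetical userspace helpers for illustration only; these names do
 * not exist in the kernel. Both assume order < 32.
 */

/* Branchy reference: order 0 yields 0, any other order yields 1 << order. */
static unsigned int compound_nr_branchy(unsigned int order)
{
	return order ? 1U << order : 0;
}

/* Branchless form: masking off bit 0 only changes the order-0 result. */
static unsigned int compound_nr_branchless(unsigned int order)
{
	return (1U << order) & ~1U;
}

/*
 * Assert that both forms agree for every order in [0, max_order] and
 * return the number of orders checked. max_order must be below 32.
 */
static unsigned int compound_nr_selfcheck(unsigned int max_order)
{
	unsigned int order;

	for (order = 0; order <= max_order; order++)
		assert(compound_nr_branchy(order) ==
		       compound_nr_branchless(order));

	return max_order + 1;
}
```

Calling compound_nr_selfcheck(10) covers the same orders as the table;
a larger bound (up to 31) exercises the rest of the 32-bit range.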


> Below is the script.
> Thanks,
> Sidhartha Kumar

Thanks for the script!!
Cheers,
-- Nico

> k:__prep_compound_gigantic_folio
> {
>         @prep_start[pid] = nsecs;
> }
>
> kr:__prep_compound_gigantic_folio
> {
>         @prep_nsecs = hist((nsecs - @prep_start[pid]));
>         delete(@prep_start[pid]);
> }