Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order

From: Colin Cross
Date: Tue Oct 07 2014 - 12:19:10 EST


On Tue, Oct 7, 2014 at 9:07 AM, PINTU KUMAR <pintu.k@xxxxxxxxxxx> wrote:
> ----- Original Message -----
>> From: Colin Cross <ccross@xxxxxxxxxxx>
>> To: pintu.k@xxxxxxxxxxx
>> Cc: Laura Abbott <lauraa@xxxxxxxxxxxxxx>; Heesub Shin <heesub.shin@xxxxxxxxxxx>; "akpm@xxxxxxxxxxxxxxxxxxxx" <akpm@xxxxxxxxxxxxxxxxxxxx>; "gregkh@xxxxxxxxxxxxxxxxxxx" <gregkh@xxxxxxxxxxxxxxxxxxx>; "john.stultz@xxxxxxxxxx" <john.stultz@xxxxxxxxxx>; "rebecca@xxxxxxxxxxx" <rebecca@xxxxxxxxxxx>; "devel@xxxxxxxxxxxxxxxxxxxx" <devel@xxxxxxxxxxxxxxxxxxxx>; "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>; IQBAL SHAREEF <iqbal.ams@xxxxxxxxxxx>; "pintu_agarwal@xxxxxxxxx" <pintu_agarwal@xxxxxxxxx>; Vishnu Pratap Singh <vishnu.ps@xxxxxxxxxxx>; "cpgs@xxxxxxxxxxx" <cpgs@xxxxxxxxxxx>
>> Sent: Monday, 6 October 2014 11:01 PM
>> Subject: Re: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order
>>
>> On Mon, Oct 6, 2014 at 9:26 AM, PINTU KUMAR <pintu.k@xxxxxxxxxxx> wrote:
>>
>>>
>>> Hi,
>>> >________________________________
>>> > From: Laura Abbott <lauraa@xxxxxxxxxxxxxx>
>>> >To: Heesub Shin <heesub.shin@xxxxxxxxxxx>; Pintu Kumar
>> <pintu.k@xxxxxxxxxxx>; akpm@xxxxxxxxxxxxxxxxxxxx;
>> gregkh@xxxxxxxxxxxxxxxxxxx; john.stultz@xxxxxxxxxx; rebecca@xxxxxxxxxxx;
>> ccross@xxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
>>> >Cc: iqbal.ams@xxxxxxxxxxx; pintu_agarwal@xxxxxxxxx;
>> vishnu.ps@xxxxxxxxxxx
>>> >Sent: Monday, 6 October 2014 7:37 PM
>>> >Subject: Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER
>> for high order
>>> >
>>> >
>>> >On 10/6/2014 3:27 AM, Heesub Shin wrote:
>>> >
>>> >
>>> >
>>> >
>>> >> Hello Kumar,
>>> >>
>>> >> On 10/06/2014 05:31 PM, Pintu Kumar wrote:
>>> >>> The Android ion_system_heap uses allocation fallback mechanism
>>> >>> based on 8,4,0 order pages available in the system.
>>> >>> It changes gfp flags based on higher order allocation request.
>>> >>> This higher order value is hard-coded as 4, instead of using
>>> >>> the system defined higher order value.
>>> >>> Thus replacing this hard-coded value with
>> PAGE_ALLOC_COSTLY_ORDER
>>> >>> which is defined as 3.
>>> >>> This will help mapping the higher order request in system heap
>> with
>>> >>> the actual allocation request.
>>> >>
>>> >> Quite reasonable.
>>> >>
>>> >> Reviewed-by: Heesub Shin <heesub.shin@xxxxxxxxxxx>
>>> >>
>>> >> BTW, Anyone knows how the allocation order (8,4 and 0) was
>> decided? I
>>> >> think only Google guys might know the answer.
>>> >>
>>> >> regards,
>>> >> heesub
>>> >>
>>> >
>>> >My understanding was this was completely unrelated to the costly order
>>> >and was related to the page sizes corresponding to IOMMU page sizes
>>> >(1MB, 64K, 4K). This won't make a difference for the uncached page
>>> >pool case but for the not page pool case, I'm not sure if there
>> would
>>> >be a benefit for trying to get 32K pages with some effort vs. just
>>> >going back to 4K pages.
>>>
>>> No, it is not just related to IOMMU case. It comes into picture also for
>>> normal system-heap allocation (without iommu cases).
>>> Also, it is applicable for both uncached and page_pool cases.
>>> Please also check the changes under ion_system_heap_create.
>>> Here the gfp_flags are set under the pool structure.
>>> This value is used in ion_page_pool_alloc_pages.
>>> In both the cases, it internally calls alloc_pages, with this gfp_flags.
>>> Now, during memory pressure scenario, when alloc_pages moves to slowpath
>>> this gfp_flags will be used to decide allocation retry.
>>> In the current code, the higher-order flag is set only when order is
>> greater than 4.
>>> But, in MM, the order 4 is also considered as higher-order request.
>>> This higher-order is decided based on PAGE_ALLOC_COSTLY_ORDER (3) value.
>>> Hence, I think this value should be in sync with the MM code.
>>> >
>>> >Do you have any data/metrics that show a benefit from this patch?
>>> I think it is not related to any data or metrics.
>>> It is about replacing the hard-coded higher-order check to be in sync with
>>> the MM code.
>>>
>>
>> The selection of the orders used for allocation (8, then 4, then 0) is
>> designed to match with the sizes often found in IOMMUs, but this isn't
>> changing the order of the allocation, it is changing the GFP flags
>> used for the order 4 allocation. Right now we are using the
>> low_order_gfp_flags for order 4, this patch would change it to use
>> high_order_gfp_flags. We originally used low_order_gfp_flags here
>> because the MM subsystem can usually satisfy these allocations, and
>> the additional load placed on the MM subsystem to kick off kswapd to
>> free up more order 4 chunks is generally worth it. Using order 4
>> pages instead of order 0 pages can significantly improve the
>> performance of many IOMMUs by reducing TLB pressure and time spent
>> updating page tables. Unless you have data showing that this improves
>> something, and doesn't just cause all allocations to be order 0 when
>> under memory pressure, I don't suggest merging this.
>>
>
> Ok agree. It is worth retrying the allocation with order-4 pages.
> But, since 4 is considered higher order for MM and is greater than PAGE_ALLOC_COSTLY_ORDER.
> I guess the retrying will not happen, because of the following check in page_alloc:
> if (order > PAGE_ALLOC_COSTLY_ORDER)
> goto nopage;

It's not the retry that I'd be particularly worried about, it's
kicking off kswapd, which will run in the background to replenish the
pool of free order 4 pages. However, when
c654345924f7cce87bb221b89db91cba890421ba (mm: remove __GFP_NO_KSWAPD)
was merged (and then reverted, and then merged again), the
__GFP_NO_KSWAPD flag was removed from high_order_gfp_flags, and when
it was finally reverted again we did not re-add the flag, so we lost
that behavior long ago.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/