Re: [PATCH net-next 2/2] page_pool: support non-frag page for page_pool_alloc_frag()

From: Yunsheng Lin
Date: Sat May 27 2023 - 05:51:48 EST


On 2023/5/26 23:16, Alexander Duyck wrote:
> On Fri, May 26, 2023 at 2:28 AM Yunsheng Lin <linyunsheng@xxxxxxxxxx> wrote:
>>
>> There is performance penalty with using page frag support when
>> user requests a larger frag size and a page only supports one
>> frag user, see [1].
>>
>> It seems like user may request different frag size depending
>> on the mtu and packet size, provide an option to allocate
>> non-frag page when a whole page is not able to hold two frags,
>> so that user has a unified interface for the memory allocation
>> with least memory utilization and performance penalty.
>>
>> 1. https://lore.kernel.org/netdev/ZEU+vospFdm08IeE@localhost.localdomain/
>>
>> Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
>> CC: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
>> CC: Alexander Duyck <alexander.duyck@xxxxxxxxx>
>
> The way I see it there are several problems with this approach.
>
> First, why not just increase the page order rather than trying to
> essentially make page_pool_alloc_frag into an analog for
> page_pool_alloc_pages? I know for the skb allocator we are working
> with an order 3 page. You could likely do something similar here to
> achieve the better performance you are looking for.

I suppose the "an order 3 page" refers to __page_frag_cache_refill()
trying to allocate an order 3 page, if it fails, then fail back to
order 0 page?

As page pool alloc and recycle page in order based on pool->alloc
and pool->ring, so we can not do the above failback trick for page
pool.

We could add different pool->alloc/pool->ring for different order
page, for example, pool->alloc_order_0/pool->ring_order_0 for order
0 page, and pool->alloc_order_3/pool->ring_order_3 for order 3 page,
but then it will be complicated and possibly more memory wasteful.

We would also create page pool with pool->p.order being 3, then
the optimization in this patch also apply.

>
> Second, I am not a fan of these changes as they seem to be wasteful
> for drivers that might make use of a mix of large and small

I suppose 'wasteful' refer to memory usage, right?
I am not sure how drivers that might make use a mix of large and
small yet, like mlx5 using page_pool_defrag_page() directly.
but if the PAGE_SIZE << pool->p.order is integral multiple of the frag,
then some waste seems to be unavoidable, does handling the frag count
in the driver save more memory than handling the frag count in page pool?
If not, handling the frag count in page pool seems more reasonable?

> allocations. If we aren't going to use fragments then we should
> probably just wrap the call to this function in an inline wrapper that
> checks the size and just automatically pulls the larger sizes off into
> the non-frag allocation path. Look at something such as
> __netdev_alloc_skb as an example.

Do you mean adding an new inline wrapper like below?

if (len > xxx)
return page_pool_alloc_pages(pool, gfp);
else
return page_pool_alloc_frag();

It seems the above is essentially the same as this patch does,
this patch does it by not introducing another interface and doing
some optimization. For example, for the above case with 'len > xxx'
in this patch, we still allow frag allocate if the remaining size of
the page is bigger than 'len', which I think it is a small optimization
for 64K page size or order 3 page.

Also, do you have some thing in mind above xxx here? As this patch
chooses the ((PAGE_SIZE << pool->p.order) / 2) as xxx.

> .
>