Re: [PATCH net-next v2 1/3] page_pool: unify frag page and non-frag page handling

From: Alexander H Duyck
Date: Tue May 30 2023 - 11:08:22 EST


On Mon, 2023-05-29 at 17:28 +0800, Yunsheng Lin wrote:
> Currently page_pool_dev_alloc_pages() can not be called
> when PP_FLAG_PAGE_FRAG is set, because it does not use
> the frag reference counting.
>
> As we are already doing a optimization by not updating
> page->pp_frag_count in page_pool_defrag_page() for the
> last frag user, and non-frag page only have one user,
> so we utilize that to unify frag page and non-frag page
> handling, so that page_pool_dev_alloc_pages() can also
> be called with PP_FLAG_PAGE_FRAG set.
>
> Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> CC: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
> CC: Alexander Duyck <alexander.duyck@xxxxxxxxx>

I"m not really a huge fan of the approach. Basically it looks like you
are trying to turn every page pool page into a fragmented page. Why not
just stick to keeping the fragemented pages and have a special case
that just generates a mono-frag page for your allocator instead.

The problem is there are some architectures where we just cannot
support having pp_frag_count due to the DMA size. So it makes sense to
leave those with just basic page pool instead of trying to fake that it
is a fragmented page.

> ---
> include/net/page_pool.h | 38 +++++++++++++++++++++++++++++++-------
> net/core/page_pool.c | 1 +
> 2 files changed, 32 insertions(+), 7 deletions(-)
>
> diff --git a/include/net/page_pool.h b/include/net/page_pool.h
> index c8ec2f34722b..ea7a0c0592a5 100644
> --- a/include/net/page_pool.h
> +++ b/include/net/page_pool.h
> @@ -50,6 +50,9 @@
> PP_FLAG_DMA_SYNC_DEV |\
> PP_FLAG_PAGE_FRAG)
>
> +#define PAGE_POOL_DMA_USE_PP_FRAG_COUNT \
> + (sizeof(dma_addr_t) > sizeof(unsigned long))
> +
> /*
> * Fast allocation side cache array/stack
> *
> @@ -295,13 +298,20 @@ void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
> */
> static inline void page_pool_fragment_page(struct page *page, long nr)
> {
> - atomic_long_set(&page->pp_frag_count, nr);
> + if (!PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
> + atomic_long_set(&page->pp_frag_count, nr);
> }
>
> +/* We need to reset frag_count back to 1 for the last user to allow
> + * only one user in case the page is recycled and allocated as non-frag
> + * page.
> + */
> static inline long page_pool_defrag_page(struct page *page, long nr)
> {
> long ret;
>
> + BUILD_BUG_ON(__builtin_constant_p(nr) && nr != 1);
> +

What is the point of this line? It doesn't make much sense to me. Are
you just trying to force an optiimization? You would be better off just
taking the BUILD_BUG_ON contents and feeding them into an if statement
below since the statement will compile out anyway.

It seems like what you would want here is:
BUG_ON(!PAGE_POOL_DMA_USE_PP_FRAG_COUNT);

Otherwise you are potentially writing to a variable that shouldn't
exist.

> /* If nr == pp_frag_count then we have cleared all remaining
> * references to the page. No need to actually overwrite it, instead
> * we can leave this to be overwritten by the calling function.
> @@ -311,19 +321,36 @@ static inline long page_pool_defrag_page(struct page *page, long nr)
> * especially when dealing with a page that may be partitioned
> * into only 2 or 3 pieces.
> */
> - if (atomic_long_read(&page->pp_frag_count) == nr)
> + if (atomic_long_read(&page->pp_frag_count) == nr) {
> + /* As we have ensured nr is always one for constant case
> + * using the BUILD_BUG_ON() as above, only need to handle
> + * the non-constant case here for frag count draining.
> + */
> + if (!__builtin_constant_p(nr))
> + atomic_long_set(&page->pp_frag_count, 1);
> +
> return 0;
> + }
>
> ret = atomic_long_sub_return(nr, &page->pp_frag_count);
> WARN_ON(ret < 0);
> +
> + /* Reset frag count back to 1, this should be the rare case when
> + * two users call page_pool_defrag_page() currently.
> + */
> + if (!ret)
> + atomic_long_set(&page->pp_frag_count, 1);
> +
> return ret;
> }
>
> static inline bool page_pool_is_last_frag(struct page_pool *pool,
> struct page *page)
> {
> - /* If fragments aren't enabled or count is 0 we were the last user */
> - return !(pool->p.flags & PP_FLAG_PAGE_FRAG) ||
> + /* When dma_addr_upper is overlapped with pp_frag_count
> + * or we were the last page frag user.
> + */
> + return PAGE_POOL_DMA_USE_PP_FRAG_COUNT ||
> (page_pool_defrag_page(page, 1) == 0);
> }
>
> @@ -357,9 +384,6 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
> page_pool_put_full_page(pool, page, true);
> }
>
> -#define PAGE_POOL_DMA_USE_PP_FRAG_COUNT \
> - (sizeof(dma_addr_t) > sizeof(unsigned long))
> -
> static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
> {
> dma_addr_t ret = page->dma_addr;
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index e212e9d7edcb..0868aa8f6323 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -334,6 +334,7 @@ static void page_pool_set_pp_info(struct page_pool *pool,
> {
> page->pp = pool;
> page->pp_magic |= PP_SIGNATURE;
> + page_pool_fragment_page(page, 1);
> if (pool->p.init_callback)
> pool->p.init_callback(page, pool->p.init_arg);
> }

Again, you are adding statements here that now have to be stripped out
under specific circumstances. In my opinion it would be better to not
modify base page pool pages and instead just have your allocator
provide a 1 frag page pool page via a special case allocator rather
then messing with all the other drivers.