Re: [PATCH net-next v10 3/4] page_pool: support unlimited number of inflight pages
From: Yunsheng Lin
Date: Wed Mar 05 2025 - 07:19:53 EST
On 2025/3/4 17:25, Jesper Dangaard Brouer wrote:
>> }
>> @@ -514,10 +518,42 @@ static struct page_pool_item *page_pool_fast_item_alloc(struct page_pool *pool)
>> return llist_entry(first, struct page_pool_item, lentry);
>> }
>> +static struct page_pool_item *page_pool_slow_item_alloc(struct page_pool *pool)
>> +{
>> + if (unlikely(!pool->slow_items.block ||
>> + pool->slow_items.next_to_use >= ITEMS_PER_PAGE)) {
>> + struct page_pool_item_block *block;
>> + struct page *page;
>> +
>> + page = alloc_pages_node(pool->p.nid, GFP_ATOMIC | __GFP_NOWARN |
>> + __GFP_ZERO, 0);
>> + if (!page) {
>> + alloc_stat_inc(pool, item_slow_failed);
>> + return NULL;
>> + }
>
> I'm missing a counter that I can use to monitor the rate of page
> allocations for these "item" blocks.
> In production I want to have a metric that shows me a sudden influx of
> events that cause code to hit this "item_slow_alloc" case
> (inflight_slow_alloc).
It seems the 'item_fast_empty' stat added in patch 2 is the metric you
mention above? As we consume those slow_items sequentially, the number of
pages allocated for slow_items can be calculated as 'item_fast_empty' /
ITEMS_PER_PAGE, see the sketch below.
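For reference, a minimal sketch of that derivation. It assumes
'item_fast_empty' is exported in pool->alloc_stats alongside the other
alloc stats (the field name follows patch 2; the helper itself is made
up for illustration):

	/* Rough estimate of pages allocated for slow_items: every
	 * fast-path miss ('item_fast_empty') consumes one slow item,
	 * and slow items are handed out sequentially, so one page is
	 * allocated per ITEMS_PER_PAGE misses.
	 */
	static u64 page_pool_slow_item_pages(const struct page_pool *pool)
	{
		return pool->alloc_stats.item_fast_empty / ITEMS_PER_PAGE;
	}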
>
> BTW should this be called "inflight_block" instead of "item_block"?
>
>
>> +
>> + block = page_address(page);
>> + block->pp = pool;
>> + block->flags |= PAGE_POOL_SLOW_ITEM_BLOCK_BIT;
>> + refcount_set(&block->ref, ITEMS_PER_PAGE);
>> + pool->slow_items.block = block;
>> + pool->slow_items.next_to_use = 0;
>> +
>> + spin_lock_bh(&pool->item_lock);
>> + list_add(&block->list, &pool->item_blocks);
>> + spin_unlock_bh(&pool->item_lock);
>> + }
>> +
>> + return &pool->slow_items.block->items[pool->slow_items.next_to_use++];
>> +}
...
>> +static void __page_pool_slow_item_free(struct page_pool *pool,
>> + struct page_pool_item_block *block)
>> +{
>> + spin_lock_bh(&pool->item_lock);
>> + list_del(&block->list);
>> + spin_unlock_bh(&pool->item_lock);
>> +
>> + put_page(virt_to_page(block));
>
> Here again I'm missing a counter that I can use to monitor the rate of
> page free events.
>
> In production I want a metric (e.g. inflight_slow_free_block) that,
> together with "item_slow_alloc" (perhaps named
> inflight_slow_alloc_block), shows me if this code path is creating
> churn, so that I can correlate it with/explain some other influx event
> on the system.
>
> BTW subtracting these (alloc - free) counters gives us the memory used.
If I understand it correctly, 'item_fast_empty' is something like the
'alloc' counter mentioned above; let's discuss the 'free' side below.
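To make the (alloc - free) arithmetic concrete, a sketch with
hypothetical counter names (neither counter exists in this patch; they
would be bumped in page_pool_slow_item_alloc() and
__page_pool_slow_item_free() respectively):

	/* Hypothetical counters (not in this patch):
	 *   slow_block_alloc - bumped once per block page allocated in
	 *                      page_pool_slow_item_alloc()
	 *   slow_block_free  - bumped once per block page freed in
	 *                      __page_pool_slow_item_free()
	 * Memory currently pinned by slow item block pages:
	 */
	u64 slow_block_mem = (slow_block_alloc - slow_block_free) * PAGE_SIZE;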
>
>> +}
>> +
...
>> }
>> +static int page_pool_nl_fill_item_mem_info(struct page_pool *pool,
>> + struct sk_buff *rsp)
>> +{
>> + struct page_pool_item_block *block;
>> + size_t resident = 0, used = 0;
>> + int err;
>> +
>> + spin_lock_bh(&pool->item_lock);
>> +
>> + list_for_each_entry(block, &pool->item_blocks, list) {
>> + resident += PAGE_SIZE;
>> +
>> + if (block->flags & PAGE_POOL_SLOW_ITEM_BLOCK_BIT)
>> + used += (PAGE_SIZE - sizeof(struct page_pool_item) *
>> + refcount_read(&block->ref));
>> + else
>> + used += PAGE_SIZE;
>> + }
>> +
>> + spin_unlock_bh(&pool->item_lock);
>
> Holding a BH spin_lock can easily create production issues.
The above not only gives us the total pages used for page_pool_item,
but also gives us the fragmentation info for those pages.
So it seems the BH spin_lock is needed if we want the fragmentation info?
And the 'free' mentioned above can be calculated as 'alloc' - 'memory used'.
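As a worked example of what the loop above reports (the item size is
illustrative, not taken from the patch): with 4K pages and a 32-byte
struct page_pool_item, a slow block whose refcount reads 64 contributes
4096 bytes to 'resident' but only 4096 - 32 * 64 = 2048 bytes to 'used',
so the used/resident ratio directly exposes the fragmentation.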
> I worry how long it will take to traverse these lists.
I wouldn't worry about that, as there are not supposed to be many pages
on those lists; if there are, it is something we should fix by
increasing the size of fast_item or by defragmenting the slow_item
pages.
>
> We (Cc Yan) are currently hunting down a number of real production
> issues caused by different cases of control-path code that takes a _bh
> lock to read data when querying the kernel, hurting the data-path
> processing.
I am not sure the above is really control-path here; I would rather
treat it as debug-path?
>
> If we had the stats counters, then this would be less work, right?
It depends on whether we want the fragmentation info or not, as
mentioned above.