Re: [PATCH net v2 1/2] page_pool: fix timing for checking and disabling napi_local
From: Yunsheng Lin
Date: Tue Oct 01 2024 - 21:53:18 EST
On 10/1/2024 7:30 PM, Paolo Abeni wrote:
> ...
>> @@ -828,6 +837,9 @@ void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
>> 		recycle_stat_inc(pool, ring_full);
>> 		page_pool_return_page(pool, netmem);
>> 	}
>> +
>> +	if (!allow_direct_orig)
>> +		rcu_read_unlock();
>
> What about always acquiring the rcu lock? Would that impact performance
> negatively?
>
> If not, I think it's preferable, as it would make the static checker happy.
As mentioned in the cover letter, the overhead is about ~2ns.

I guess it is the 'if' check before rcu_read_unlock() that the static
checker is not happy about; there is also an 'if' check before the
'destroy_lock' introduced in patch 2. Maybe '__cond_acquires' can be
used to make the static checker happy?
>> }
>> EXPORT_SYMBOL(page_pool_put_unrefed_netmem);
> [...]
>> @@ -1121,6 +1140,12 @@ void page_pool_destroy(struct page_pool *pool)
>> 		return;
>>
>> 	page_pool_disable_direct_recycling(pool);
>> +
>> +	/* Wait for the freeing side to see the direct recycling being
>> +	 * disabled, to avoid concurrent access to the pool->alloc cache.
>> +	 */
>> +	synchronize_rcu();
>
> When turning on/off a device with a lot of queues, the above could
> introduce a lot of long waits under the RTNL lock, right?
>
> What about moving the tail of this function into a separate helper and
> using call_rcu() instead?
For this patch, yes, that can be done.

But patch 2 also relies on the RCU lock introduced in this patch to
ensure that the freeing side is synchronized with the destroy path, and
the dma unmapping done for the inflight pages in page_pool_item_uninit()
cannot be deferred to call_rcu(), as the driver might have been unbound
by the time the RCU callback runs, which would defeat the purpose of
patch 2.

Maybe an optimization here is to only call synchronize_rcu() when there
are inflight pages and pool->dma_map is set.