Yunsheng Lin <linyunsheng@xxxxxxxxxx> writes:

>> So I really don't see a way for this race to happen with correct usage
>> of the page_pool and NAPI APIs, which means there's no reason to make
>> the change you are proposing here.
>
> I looked at one driver setting pp->napi; it seems the bnxt driver
> doesn't call page_pool_disable_direct_recycling() when unloading, see
> bnxt_half_close_nic() below. page_pool_disable_direct_recycling() seems
> to only be called for the new queue_mgmt API:
>
> /* rtnl_lock held, this call can only be made after a previous successful
>  * call to bnxt_half_open_nic().
>  */
> void bnxt_half_close_nic(struct bnxt *bp)
> {
> 	bnxt_hwrm_resource_free(bp, false, true);
> 	bnxt_del_napi(bp);		/* <-- calls napi del and RCU sync */
> 	bnxt_free_skbs(bp);
> 	bnxt_free_mem(bp, true);	/* <-- calls page_pool_destroy() */
> 	clear_bit(BNXT_STATE_HALF_OPEN, &bp->state);
> }
>
> Even if page_pool_disable_direct_recycling() were called between
> bnxt_del_napi() and bnxt_free_mem(), the timing window would still
> exist, as the RCU sync needs to happen after
> page_pool_disable_direct_recycling(). It seems some refactoring of the
> bnxt driver is needed to reuse the RCU sync from the NAPI API, in order
> to avoid calling the RCU sync for page_pool_destroy().
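
(For concreteness, the refactoring described above would presumably amount
to something like the following untested sketch; the walk over
bp->rx_ring[] via rx_nr_rings is only a guess at the bnxt data layout, not
a real patch.)

/* Untested sketch: disable direct recycling on every rx page pool
 * *before* bnxt_del_napi(), so that the RCU synchronization done as
 * part of the NAPI teardown also covers the recycling path, and no
 * extra synchronize_rcu() is needed around page_pool_destroy().
 */
void bnxt_half_close_nic(struct bnxt *bp)
{
	int i;

	bnxt_hwrm_resource_free(bp, false, true);

	for (i = 0; i < bp->rx_nr_rings; i++)
		if (bp->rx_ring[i].page_pool)
			page_pool_disable_direct_recycling(bp->rx_ring[i].page_pool);

	bnxt_del_napi(bp);		/* RCU sync happens here */
	bnxt_free_skbs(bp);
	bnxt_free_mem(bp, true);	/* page_pool_destroy(), no extra sync */
	clear_bit(BNXT_STATE_HALF_OPEN, &bp->state);
}
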
Well, I would consider that usage buggy. A page pool object is created
with a reference to the napi struct; so the page pool should also be
destroyed (clearing its reference) before the napi memory is freed. I
guess this is not really documented anywhere, but it's pretty standard
practice to free objects in the opposite order of their creation.
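
To illustrate the ordering, here is a minimal sketch with made-up names
(my_ring, my_ring_open()/my_ring_close(), my_poll()); this is not actual
driver code:

#include <linux/err.h>
#include <linux/netdevice.h>
#include <net/page_pool/helpers.h>

struct my_ring {
	struct napi_struct napi;
	struct page_pool *page_pool;
};

static int my_poll(struct napi_struct *napi, int budget)
{
	return 0;	/* stub poll function for the sketch */
}

static int my_ring_open(struct net_device *dev, struct my_ring *ring,
			unsigned int pool_size)
{
	struct page_pool_params pp = {};

	pp.pool_size = pool_size;
	pp.napi = &ring->napi;		/* pool holds a reference to the napi */

	/* creation order: napi first, then the pool that refers to it */
	netif_napi_add(dev, &ring->napi, my_poll);

	ring->page_pool = page_pool_create(&pp);
	if (IS_ERR(ring->page_pool)) {
		netif_napi_del(&ring->napi);
		return PTR_ERR(ring->page_pool);
	}

	return 0;
}

static void my_ring_close(struct my_ring *ring)
{
	/* destruction in the opposite order: destroy the pool (clearing
	 * its napi reference) before the napi itself is deleted/freed
	 */
	page_pool_destroy(ring->page_pool);
	netif_napi_del(&ring->napi);
}

That way the pool's reference can never outlive the napi struct it points
to.
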
So no, I don't think this is something that should be fixed on the page
pool side (and certainly not by adding another synchronize_rcu() call
per queue!); rather, we should fix the drivers that get this wrong (and
probably document the requirement a bit better).