Re: [PATCH net v2] octeontx2-pf: Fix page pool cache index corruption.

From: Simon Horman
Date: Thu Sep 07 2023 - 12:01:48 EST


On Thu, Sep 07, 2023 at 07:17:11AM +0530, Ratheesh Kannoth wrote:
> The access to page pool `cache' array and the `count' variable
> is not locked. Page pool cache access is fine as long as there
> is only one consumer per pool.
>
> octeontx2 driver fills in rx buffers from page pool in NAPI context.
> If system is stressed and could not allocate buffers, refiiling work
> will be delegated to a delayed workqueue. This means that there are
> two cosumers to the page pool cache.
>
> Either workqueue or IRQ/NAPI can be run on other CPU. This will lead
> to lock less access, hence corruption of cache pool indexes.
>
> To fix this issue, NAPI is rescheduled from workqueue context to refill
> rx buffers.
>
> Fixes: b2e3406a38f0 ("octeontx2-pf: Add support for page pool")
> Signed-off-by: Ratheesh Kannoth <rkannoth@xxxxxxxxxxx>
>
> ---
> ChangeLogs
> v1 -> v2: Added local_bh_disable() to be precise on napi scheduling.
>
> v0 -> v1: udelay will waste CPU cycles in UP. So call napi_schedule from
> delayed work queue context.
>
> ---
> ---
> .../ethernet/marvell/octeontx2/nic/cn10k.c | 4 +-
> .../ethernet/marvell/octeontx2/nic/cn10k.h | 2 +-
> .../marvell/octeontx2/nic/otx2_common.c | 43 +++----------------
> .../marvell/octeontx2/nic/otx2_common.h | 3 +-
> .../ethernet/marvell/octeontx2/nic/otx2_pf.c | 7 +--
> .../marvell/octeontx2/nic/otx2_txrx.c | 29 ++++++++++---
> .../marvell/octeontx2/nic/otx2_txrx.h | 4 +-
> 7 files changed, 42 insertions(+), 50 deletions(-)
>
> diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c
> index 826f691de259..211c7d8a0556 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn10k.c
> @@ -107,12 +107,13 @@ int cn10k_sq_aq_init(void *dev, u16 qidx, u16 sqb_aura)
> }
>
> #define NPA_MAX_BURST 16
> -void cn10k_refill_pool_ptrs(void *dev, struct otx2_cq_queue *cq)
> +int cn10k_refill_pool_ptrs(void *dev, struct otx2_cq_queue *cq)
> {
> struct otx2_nic *pfvf = dev;
> u64 ptrs[NPA_MAX_BURST];
> int num_ptrs = 1;
> dma_addr_t bufptr;
> + int cnt = cq->pool_ptrs;

nit: please arrange local variables in new Networking code in reverse xmas
tree order - longest line to shortest.

>
> /* Refill pool with new buffers */
> while (cq->pool_ptrs) {

...