Re: [PATCH net] octeontx2-af: npc: cn20k: Fix subbank free list indexing for search order

From: Ratheesh Kannoth

Date: Fri Jun 19 2026 - 05:46:10 EST

On 2026-06-19 at 14:43:41, Simon Horman (horms@xxxxxxxxxx) wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://netdev-ai.bots.linux.dev/sashiko/
> ---
> octeontx2-af: npc: cn20k: Fix subbank free list indexing for search order
>
> This change re-keys the xa_sb_free free list using each subbank's slot
> in subbank_srch_order[] rather than using subbank_srch_order[sb->idx],
> moving the population into a new helper called from npc_priv_init().
> A few comments inline.
>
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> > index 354c4e881c6a..d38e848add93 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
> > @@ -3423,6 +3423,36 @@ static int npc_create_srch_order(int cnt)
> > return 0;
> > }
> >
> > +static int npc_subbanks_srch_oder_init(struct rvu *rvu)
> ^^^^
>
> [Low]
> Is "oder" intended here? The neighbouring identifiers
> npc_create_srch_order() and subbank_srch_order use "order", so this
> appears to be a typo that would be easier to fix before the symbol
> lands.
>
> > +{
> > + struct npc_subbank *sb;
> > + int sb_idx;
> > + int i, j;
> > + int rc;
> > +
> > + for (i = 0; i < npc_priv->num_subbanks; i++) {
> > + sb_idx = subbank_srch_order[i];
> > + sb = &npc_priv->sb[sb_idx];
> > + sb->arr_idx = i;
> > +
> > + dev_dbg(rvu->dev, "%s: sb->idx=%u sb->arr_idx=%u\n",
> > + __func__, sb->idx, sb->arr_idx);
> > +
> > + rc = xa_err(xa_store(&npc_priv->xa_sb_free, sb->arr_idx,
> > + xa_mk_value(sb->idx), GFP_KERNEL));
> > + if (rc) {
> > + dev_err(rvu->dev,
> > + "%s: xa_store(xa_sb_free) failed at slot %d (sb=%d): %d\n",
> > + __func__, i, sb_idx, rc);
> > + for (j = 0; j < i; j++)
> > + xa_erase(&npc_priv->xa_sb_free, j);
> > + return rc;
> > + }
> > + }
> > +
> > + return 0;
> > +}
> > +
>
> [ ... ]
>
> > @@ -4727,6 +4748,10 @@ static int npc_priv_init(struct rvu *rvu)
> > for (i = 0, sb = npc_priv->sb; i < num_subbanks; i++, sb++)
> > npc_subbank_init(rvu, sb, i);
> >
> > + ret = npc_subbanks_srch_oder_init(rvu);
> > + if (ret)
> > + goto fail2;
> > +
>
> [Medium]
> Can this leak subbank_srch_order on the new failure path?
>
> npc_create_srch_order() allocates subbank_srch_order via kcalloc().
> Before this patch, the only way to reach fail2 was when
> npc_create_srch_order() itself failed and left subbank_srch_order
> NULL, so fail2 not freeing it was safe.
>
> With this new goto fail2, subbank_srch_order has already been
> allocated, but the fail2 label only frees npc_priv->sb and npc_priv:
>
> fail2:
> xa_destroy(&npc_priv->xa_sb_used);
> [...]
> kfree(npc_priv->sb);
> npc_priv->sb = NULL;
> fail1:
> kfree(npc_priv);
> npc_priv = NULL;
> return ret;
>
> A subsequent re-init would unconditionally reassign
> subbank_srch_order = kcalloc(...) in npc_create_srch_order(),
> permanently leaking the previous buffer.
>
> Should this path go to a new label that also kfree()s
> subbank_srch_order, or should fail2 free it?
>
> > /* Get number of pcifuncs in the system */
> > npc_priv->pf_cnt = npc_pcifunc_map_create(rvu);
> > npc_priv->xa_pf2idx_map = kcalloc(npc_priv->pf_cnt,
>
> [High]
> Can npc_priv_init() return 0 on a real failure here?
>
> After npc_subbanks_srch_oder_init() returns successfully, ret is 0.
> If the kcalloc() for npc_priv->xa_pf2idx_map fails right after, the
> code does goto fail3 which falls through fail2 and fail1 to
> return ret, propagating 0 to the caller:
>
> fail3:
> kfree(subbank_srch_order);
> subbank_srch_order = NULL;
>
> fail2:
> [...]
> fail1:
> kfree(npc_priv);
> npc_priv = NULL;
> return ret;
>
> The caller npc_cn20k_init() then sees err == 0 and continues into
> npc_setup_mcam_section(), which dereferences npc_priv (now NULL).
>
> Pre-patch the function ended in return -ENOMEM, so the fail3 path
> returned -ENOMEM unconditionally. Should ret be reset to -ENOMEM
> before goto fail3, or should a separate variable be used for the
> new helper's return value?
>
> > @@ -4760,7 +4785,7 @@ static int npc_priv_init(struct rvu *rvu)
> > fail1:
> > kfree(npc_priv);
> > npc_priv = NULL;
> > - return -ENOMEM;
> > + return ret;
> > }

ACK for all.

pw-bot: changes-requested