Re: [PATCH net] net: ethernet: ti: am65-cpsw: Fix NAPI registration sequence

From: Sverdlin, Alexander
Date: Tue Mar 11 2025 - 04:57:56 EST


Hi Siddharth!

On Tue, 2025-03-11 at 14:21 +0530, s-vadapalli@xxxxxx wrote:
> > > Registering the interrupts for TX or RX DMA Channels prior to registering
> > > their respective NAPI callbacks can result in a NULL pointer dereference.
> > > This is seen in practice as a random occurrence since it depends on the
> > > randomness associated with the generation of traffic by Linux and the
> > > reception of traffic from the wire.
> > >
> > > Fixes: 681eb2beb3ef ("net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path")
> >
> > The patch Vignesh mentions here...
> >
> > > Signed-off-by: Vignesh Raghavendra <vigneshr@xxxxxx>
> > > Co-developed-by: Siddharth Vadapalli <s-vadapalli@xxxxxx>
> > > Signed-off-by: Siddharth Vadapalli <s-vadapalli@xxxxxx>

...

> > > --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> > > +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> > > @@ -2314,6 +2314,9 @@ static int am65_cpsw_nuss_ndev_add_tx_napi(struct am65_cpsw_common *common)
> > >   hrtimer_init(&tx_chn->tx_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED);
> > >   tx_chn->tx_hrtimer.function = &am65_cpsw_nuss_tx_timer_callback;
> > >  
> > > + netif_napi_add_tx(common->dma_ndev, &tx_chn->napi_tx,
> > > +   am65_cpsw_nuss_tx_poll);
> > > +
> > >   ret = devm_request_irq(dev, tx_chn->irq,
> > >          am65_cpsw_nuss_tx_irq,
> > >          IRQF_TRIGGER_HIGH,
> > > @@ -2323,9 +2326,6 @@ static int am65_cpsw_nuss_ndev_add_tx_napi(struct am65_cpsw_common *common)
> > >   tx_chn->id, tx_chn->irq, ret);
> > >   goto err;
> > >   }
> > > -
> > > - netif_napi_add_tx(common->dma_ndev, &tx_chn->napi_tx,
> > > -   am65_cpsw_nuss_tx_poll);
> >
> > ... has accounted for the fact ..._napi_add_... happens after [possibly unsuccessful] request_irq,
> > please grep for "for (--i ;". Is it necessary to adjust both loops, in the below case too?
>
> Yes! The order within the cleanup path has to be reversed too i.e.

Not only reversing the order...
What I'm referring to is: when requesting the i-th IRQ fails, the i-th NAPI
has already been added, but the cleanup loops start from the [i-1]-th instance.
That looks like a potential leak to me...
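
To make the concern concrete, here is a minimal user-space sketch, not driver
code. struct fake_chn, register_channels(), count_napi() and the fix_ith flag
are hypothetical names made up for illustration; the comments only mark where
the real calls would sit. It models "add NAPI, then request IRQ" and unwinds
with the same "for (--i ; ..." pattern:

```c
#include <stdbool.h>

/* Hypothetical stand-in for one channel; not the driver's structure. */
struct fake_chn {
	bool napi_added;
	bool irq_requested;
};

/* Count the channels that still have a NAPI instance registered. */
int count_napi(const struct fake_chn *chns, int n)
{
	int i, count = 0;

	for (i = 0; i < n; i++)
		count += chns[i].napi_added ? 1 : 0;
	return count;
}

/*
 * Model of the registration loop with NAPI added before the IRQ request.
 * fail_at: index whose IRQ request fails (-1 for no failure).
 * fix_ith: if true, also delete the NAPI of the failing channel.
 * Returns 0 on success, -1 after unwinding on failure.
 */
int register_channels(struct fake_chn *chns, int n, int fail_at, bool fix_ith)
{
	int i;

	for (i = 0; i < n; i++) {
		chns[i].napi_added = true;	/* where netif_napi_add_tx() sits */
		if (i == fail_at)
			goto err;		/* devm_request_irq() failed */
		chns[i].irq_requested = true;
	}
	return 0;

err:
	if (fix_ith)
		chns[i].napi_added = false;	/* delete the i-th NAPI as well */

	/* Same shape as the existing cleanup: starts at i - 1. */
	for (--i; i >= 0; i--) {
		chns[i].irq_requested = false;	/* devm_free_irq() */
		chns[i].napi_added = false;	/* netif_napi_del() */
	}
	return -1;
}
```

With fix_ith == false and a failure at index 2 of 4, count_napi() reports one
NAPI still registered (the failing channel's own); with fix_ith == true it
reports none. Whether the real fix belongs before the loop or in the loop
bounds is a design choice for the v2 patch; the sketch only shows the gap.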

> release IRQ first followed by deleting the NAPI callback. I assume that
> you are referring to the same. Please let me know otherwise. The diff
> corresponding to it is:
> ---------------------------------------------------------------------------------------------------
> diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> index d5291281c781..32c844816501 100644
> --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> @@ -2334,8 +2334,8 @@ static int am65_cpsw_nuss_ndev_add_tx_napi(struct am65_cpsw_common *common)
>         for (--i ; i >= 0 ; i--) {
>                 struct am65_cpsw_tx_chn *tx_chn = &common->tx_chns[i];
>
> -               netif_napi_del(&tx_chn->napi_tx);
>                 devm_free_irq(dev, tx_chn->irq, tx_chn);
> +               netif_napi_del(&tx_chn->napi_tx);
>         }
>
>         return ret;
> @@ -2592,8 +2592,8 @@ static int am65_cpsw_nuss_init_rx_chns(struct am65_cpsw_common *common)
>  err_flow:
>         for (--i; i >= 0 ; i--) {
>                 flow = &rx_chn->flows[i];
> -               netif_napi_del(&flow->napi_rx);
>                 devm_free_irq(dev, flow->irq, flow);
> +               netif_napi_del(&flow->napi_rx);
>         }
>
>  err:
> ---------------------------------------------------------------------------------------------------
> Based on your confirmation, I will implement the above and post the v2
> patch. Thank you for reviewing this patch and providing feedback.

--
Alexander Sverdlin
Siemens AG
www.siemens.com