Re: [PATCH net v2] net: ethernet: ti: am65-cpsw: Fix NAPI registration sequence

From: Sverdlin, Alexander
Date: Tue Mar 11 2025 - 09:53:43 EST


Hi Siddharth!

On Tue, 2025-03-11 at 18:31 +0530, Siddharth Vadapalli wrote:
> From: Vignesh Raghavendra <vigneshr@xxxxxx>
>
> Registering the interrupts for TX or RX DMA Channels prior to registering
> their respective NAPI callbacks can result in a NULL pointer dereference.
> This is seen in practice as a random occurrence since it depends on the
> randomness associated with the generation of traffic by Linux and the
> reception of traffic from the wire.
>
> Fixes: 681eb2beb3ef ("net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path")
> Signed-off-by: Vignesh Raghavendra <vigneshr@xxxxxx>
> Co-developed-by: Siddharth Vadapalli <s-vadapalli@xxxxxx>
> Signed-off-by: Siddharth Vadapalli <s-vadapalli@xxxxxx>

...

> v1 of this patch is at:
> https://lore.kernel.org/all/20250311061214.4111634-1-s-vadapalli@xxxxxx/
> Changes since v1:
> - Based on the feedback provided by Alexander Sverdlin <alexander.sverdlin@xxxxxxxxxxx>
>   the patch has been updated to account for the cleanup path in terms of an imbalance
>   between the number of successful netif_napi_add_tx/netif_napi_add calls and the
>   number of successful devm_request_irq() calls. In the event of an error, we will
>   always have one extra successful netif_napi_add_tx/netif_napi_add that needs to be
>   cleaned up before we clean an equal number of netif_napi_add_tx/netif_napi_add and
>   devm_request_irq.

...

> --- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> +++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
> @@ -2569,6 +2570,9 @@ static int am65_cpsw_nuss_init_rx_chns(struct am65_cpsw_common *common)
>        HRTIMER_MODE_REL_PINNED);
>   flow->rx_hrtimer.function = &am65_cpsw_nuss_rx_timer_callback;
>  
> + netif_napi_add(common->dma_ndev, &flow->napi_rx,
> +        am65_cpsw_nuss_rx_poll);
> +
>   ret = devm_request_irq(dev, flow->irq,
>          am65_cpsw_nuss_rx_irq,
>          IRQF_TRIGGER_HIGH,
> @@ -2579,9 +2583,6 @@ static int am65_cpsw_nuss_init_rx_chns(struct am65_cpsw_common *common)
>   flow->irq = -EINVAL;
>   goto err_flow;
>   }
> -
> - netif_napi_add(common->dma_ndev, &flow->napi_rx,
> -        am65_cpsw_nuss_rx_poll);
>   }
>  
>   /* setup classifier to route priorities to flows */
> @@ -2590,10 +2591,11 @@ static int am65_cpsw_nuss_init_rx_chns(struct am65_cpsw_common *common)
>   return 0;
>  
>  err_flow:
> - for (--i; i >= 0 ; i--) {
> + netif_napi_del(&flow->napi_rx);

There are three "goto err_flow;" instances in total, so if k3_udma_glue_rx_flow_init() or
k3_udma_glue_rx_get_irq() fails on the first iteration, we arrive here without a single
prior call to netif_napi_add(), and the netif_napi_del() above runs on a NAPI instance
that was never added.
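
To illustrate the imbalance, here is a minimal user-space sketch (not driver
code). All model_* names, the fail_step enum and the err_napi label are
hypothetical and exist only for this example. The three failure points stand in
for the three "goto err_flow;" sites. With v2_cleanup set, every failure takes
the unconditional delete as in the hunk above; without it, only a failure after
the add does, which is one possible way to keep the calls balanced. The model
only counts deletes that have no matching add; it makes no claim about what
netif_napi_del() actually does in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_FLOWS 4

/* Hypothetical failure points, one per "goto err_flow" site in the loop. */
enum fail_step { FAIL_NONE, FAIL_FLOW_INIT, FAIL_GET_IRQ, FAIL_REQUEST_IRQ };

struct model_flow {
	bool napi_added;
	bool irq_requested;
};

static struct model_flow flows[NUM_FLOWS];
static int unbalanced_napi_del;	/* deletes issued without a prior add */

static void model_napi_add(struct model_flow *f)
{
	f->napi_added = true;
}

static void model_napi_del(struct model_flow *f)
{
	if (!f->napi_added)
		unbalanced_napi_del++;
	f->napi_added = false;
}

/* Number of flows that still hold a NAPI or IRQ registration. */
static int model_still_registered(void)
{
	int i, n = 0;

	for (i = 0; i < NUM_FLOWS; i++)
		n += flows[i].napi_added || flows[i].irq_requested;
	return n;
}

/*
 * fail_flow < 0 means no failure is injected. With v2_cleanup set, every
 * failure takes the unconditional delete, as in the quoted hunk.
 */
static int model_init_rx_chns(int fail_flow, enum fail_step step, bool v2_cleanup)
{
	struct model_flow *flow = NULL;
	int i;

	assert(fail_flow < NUM_FLOWS);
	memset(flows, 0, sizeof(flows));
	unbalanced_napi_del = 0;

	for (i = 0; i < NUM_FLOWS; i++) {
		bool hit = (i == fail_flow);

		flow = &flows[i];
		/* stands in for rx_flow_init() or rx_get_irq() failing */
		if (hit && (step == FAIL_FLOW_INIT || step == FAIL_GET_IRQ)) {
			if (v2_cleanup)
				goto err_napi;
			goto err_flow;
		}

		model_napi_add(flow);

		/* stands in for devm_request_irq() failing */
		if (hit && step == FAIL_REQUEST_IRQ)
			goto err_napi;
		flow->irq_requested = true;
	}
	return 0;

err_napi:
	/* balanced only if the current flow's NAPI was actually added */
	model_napi_del(flow);
err_flow:
	for (--i; i >= 0; i--) {
		flow = &flows[i];
		flow->irq_requested = false;	/* stands in for devm_free_irq() */
		model_napi_del(flow);
	}
	return -1;
}
```

Calling model_init_rx_chns(0, FAIL_FLOW_INIT, true) leaves unbalanced_napi_del
at 1, while the same failure with v2_cleanup == false leaves it at 0 and no
flow registered.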

> + for (--i; i >= 0; i--) {
>   flow = &rx_chn->flows[i];
> - netif_napi_del(&flow->napi_rx);
>   devm_free_irq(dev, flow->irq, flow);
> + netif_napi_del(&flow->napi_rx);
>   }
>  
>  err:

--
Alexander Sverdlin
Siemens AG
www.siemens.com