Re: [PATCH V2 net] net: mana: Fix error handling in mana_create_txq/rxq's NAPI cleanup

From: Shradha Gupta
Date: Fri Aug 23 2024 - 23:52:34 EST


On Fri, Aug 23, 2024 at 02:44:29AM -0700, Souradeep Chakrabarti wrote:
> Currently napi_disable() gets called during rxq and txq cleanup,
> even before napi is enabled and hrtimer is initialized. This causes
> a kernel panic.
>
> ? page_fault_oops+0x136/0x2b0
> ? page_counter_cancel+0x2e/0x80
> ? do_user_addr_fault+0x2f2/0x640
> ? refill_obj_stock+0xc4/0x110
> ? exc_page_fault+0x71/0x160
> ? asm_exc_page_fault+0x27/0x30
> ? __mmdrop+0x10/0x180
> ? __mmdrop+0xec/0x180
> ? hrtimer_active+0xd/0x50
> hrtimer_try_to_cancel+0x2c/0xf0
> hrtimer_cancel+0x15/0x30
> napi_disable+0x65/0x90
> mana_destroy_rxq+0x4c/0x2f0
> mana_create_rxq.isra.0+0x56c/0x6d0
> ? mana_uncfg_vport+0x50/0x50
> mana_alloc_queues+0x21b/0x320
> ? skb_dequeue+0x5f/0x80
>
> Fixes: e1b5683ff62e ("net: mana: Move NAPI from EQ to CQ")
> Signed-off-by: Souradeep Chakrabarti <schakrabarti@xxxxxxxxxxxxxxxxxxx>
> ---
> V1 -> V2:
> Addressed the comment on cleaning up napi for the queues,
> where queue creation was successful.
> ---
> drivers/net/ethernet/microsoft/mana/mana_en.c | 22 +++++++++++--------
> 1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 39f56973746d..7448085fd49e 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1872,10 +1872,11 @@ static void mana_destroy_txq(struct mana_port_context *apc)
>
>  	for (i = 0; i < apc->num_queues; i++) {
>  		napi = &apc->tx_qp[i].tx_cq.napi;
> -		napi_synchronize(napi);
> -		napi_disable(napi);
> -		netif_napi_del(napi);
> -
> +		if (napi->dev == apc->ndev) {
> +			napi_synchronize(napi);
> +			napi_disable(napi);
> +			netif_napi_del(napi);
> +		}
>  		mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
>
>  		mana_deinit_cq(apc, &apc->tx_qp[i].tx_cq);
> @@ -2023,14 +2024,17 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
>
>  	napi = &rxq->rx_cq.napi;
>
> -	if (validate_state)
> -		napi_synchronize(napi);
> +	if (napi->dev == apc->ndev) {
>
> -	napi_disable(napi);
> +		if (validate_state)
> +			napi_synchronize(napi);
>
> -	xdp_rxq_info_unreg(&rxq->xdp_rxq);
> +		napi_disable(napi);
>
> -	netif_napi_del(napi);
> +		netif_napi_del(napi);
> +	}
> +
> +	xdp_rxq_info_unreg(&rxq->xdp_rxq);
>
>  	mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj);

Reviewed-by: Shradha Gupta <shradhagupta@xxxxxxxxxxxxxxxxxxx>
>
> --
> 2.34.1
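
For anyone following the thread, the guard the patch adds can be shown
outside the kernel. What follows is only a userspace sketch under stated
assumptions: fake_dev, fake_napi, fake_queue and the fake_* helpers are
hypothetical stand-ins invented for illustration, not kernel types or
APIs. The one idea taken from the patch is the check itself: tear down
NAPI only when napi->dev already points at the owning netdev, which
holds only for queues whose NAPI setup actually ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct fake_dev {
	int id;
};

struct fake_napi {
	struct fake_dev *dev;	/* set only by fake_napi_add() */
	bool enabled;
};

struct fake_queue {
	struct fake_napi napi;
	bool hw_obj_created;
};

/* Stands in for NAPI registration plus enable: records the owner. */
static void fake_napi_add(struct fake_napi *n, struct fake_dev *dev)
{
	n->dev = dev;
	n->enabled = true;
}

/*
 * Stands in for NAPI disable plus delete. It is only valid on an
 * initialized instance; the assert models the crash in the trace.
 * Clearing dev is a choice of this sketch, not a claim about the kernel.
 */
static void fake_napi_del(struct fake_napi *n)
{
	assert(n->dev != NULL);
	n->enabled = false;
	n->dev = NULL;
}

/*
 * Cleanup path with the guard: NAPI teardown is skipped for queues that
 * never reached fake_napi_add(), while the rest of the cleanup still runs.
 * Returns how many queues had their NAPI torn down.
 */
static int fake_destroy_queues(struct fake_queue *q, size_t n,
			       struct fake_dev *dev)
{
	int torn_down = 0;

	for (size_t i = 0; i < n; i++) {
		if (q[i].napi.dev == dev) {
			fake_napi_del(&q[i].napi);
			torn_down++;
		}
		q[i].hw_obj_created = false;
	}
	return torn_down;
}
```

With three zero-initialized queues of which only the first went through
fake_napi_add(), fake_destroy_queues() tears down exactly one NAPI
instance and still releases the other per-queue state, without touching
the uninitialized ones.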