Re: [PATCH v14 net-next 11/11] net/nebula-matrix: add common dev start/stop operation

From: Jakub Kicinski

Date: Fri May 15 2026 - 21:18:25 EST

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net/nebula-matrix: add common dev start/stop operation

This patch sets up common device start and stop operations. It configures the
MSI-X map table, allocates and enables MSI-X vectors, and registers the
mailbox ISR.

> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_core/nbl_dev.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_core/nbl_dev.c
> index 9b7f4598fccc3..c7e632a16b40c 100644
> --- a/drivers/net/ethernet/nebula-matrix/nbl/nbl_core/nbl_dev.c
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_core/nbl_dev.c
[ ... ]
> @@ -79,6 +259,14 @@ static void nbl_dev_clean_mailbox_task(struct work_struct *work)
> NBL_CHAN_TYPE_MAILBOX);
> }
>
> +static int nbl_dev_clean_mailbox_schedule(struct nbl_dev_mgt *dev_mgt)
> +{
> +	struct nbl_dev_common *common_dev = dev_mgt->common_dev;
> +
> +	nbl_common_queue_work(&common_dev->clean_mbx_task);
> +	return 0;
> +}
> +
> /* ---------- Dev init process ---------- */
> static int nbl_dev_setup_common_dev(struct nbl_adapter *adapter)
> {

Looking at nbl_dev_setup_common_dev() in the same file:

    nbl_dev_setup_common_dev() {
        ...
        ret = nbl_dev_setup_chan_queue(dev_mgt, NBL_CHAN_TYPE_MAILBOX);
        if (ret)
            return ret;

        INIT_WORK(&common_dev->clean_mbx_task, nbl_dev_clean_mailbox_task);
        common->vsi_id = disp_ops->get_vsi_id(priv, NBL_VSI_DATA);
        if (common->vsi_id == U32_MAX)
            return -ENOENT;
        ...
    }

If operations like disp_ops->get_vsi_id() or disp_ops->get_eth_id() fail,
the function returns the error directly. Could this leak the mailbox queue
resources allocated by nbl_dev_setup_chan_queue() on these initialization
error paths?

Since dev_mgt->common_dev is only assigned at the end of the setup function,
it appears the caller nbl_dev_init() cannot clean this up either.
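
For illustration only, the usual goto-based unwind for this kind of error
path can be modelled in userspace as below. This is a sketch, not the
driver's code: every model_* name is hypothetical, malloc() stands in for
the mailbox queue resources, and whether the driver has a matching teardown
helper for nbl_dev_setup_chan_queue() is an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins; none of these are the driver's real definitions. */
struct model_dev {
	void *mbx_queue;	/* models the mailbox queue resources */
	uint32_t vsi_id;
	int fail_vsi_lookup;	/* test knob: force the lookup to fail */
};

static int model_setup_chan_queue(struct model_dev *dev)
{
	dev->mbx_queue = malloc(64);
	return dev->mbx_queue ? 0 : -ENOMEM;
}

/* Hypothetical counterpart to the setup helper above. */
static void model_remove_chan_queue(struct model_dev *dev)
{
	free(dev->mbx_queue);
	dev->mbx_queue = NULL;
}

static uint32_t model_get_vsi_id(const struct model_dev *dev)
{
	return dev->fail_vsi_lookup ? UINT32_MAX : 7;
}

static int model_setup_common_dev(struct model_dev *dev)
{
	int ret;

	ret = model_setup_chan_queue(dev);
	if (ret)
		return ret;

	dev->vsi_id = model_get_vsi_id(dev);
	if (dev->vsi_id == UINT32_MAX) {
		ret = -ENOENT;
		goto remove_chan_queue;	/* unwind instead of returning directly */
	}

	return 0;

remove_chan_queue:
	model_remove_chan_queue(dev);
	return ret;
}
```

With the unwind label in place, a failed lookup releases what the earlier
step allocated, so the caller does not need a pointer to the partially
initialized structure.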

[ ... ]
> @@ -220,9 +408,41 @@ void nbl_dev_remove(struct nbl_adapter *adapter)
> /* ---------- Dev start process ---------- */
> int nbl_dev_start(struct nbl_adapter *adapter)
> {
> +	struct nbl_dev_mgt *dev_mgt = adapter->core.dev_mgt;
> +	int ret;
> +
> +	ret = nbl_dev_configure_msix_map(dev_mgt);
> +	if (ret)
> +		goto config_msix_map_err;
> +
> +	ret = nbl_dev_init_interrupt_scheme(dev_mgt);
> +	if (ret)
> +		goto init_interrupt_scheme_err;
> +	ret = nbl_dev_request_mailbox_irq(dev_mgt);
> +	if (ret)
> +		goto mailbox_request_irq_err;
> +	ret = nbl_dev_enable_mailbox_irq(dev_mgt);
> +	if (ret)
> +		goto enable_mailbox_irq_err;
> +
>  	return 0;
> +enable_mailbox_irq_err:
> +	nbl_dev_disable_mailbox_irq(dev_mgt);
> +	nbl_dev_free_mailbox_irq(dev_mgt);
> +mailbox_request_irq_err:
> +	nbl_dev_clear_interrupt_scheme(dev_mgt);
> +init_interrupt_scheme_err:
> +	nbl_dev_destroy_msix_map(dev_mgt);
> +config_msix_map_err:
> +	return ret;
> }
>
> void nbl_dev_stop(struct nbl_adapter *adapter)
> {
> +	struct nbl_dev_mgt *dev_mgt = adapter->core.dev_mgt;
> +
> +	nbl_dev_disable_mailbox_irq(dev_mgt);
> +	nbl_dev_free_mailbox_irq(dev_mgt);
> +	nbl_dev_clear_interrupt_scheme(dev_mgt);
> +	nbl_dev_destroy_msix_map(dev_mgt);
> }

When nbl_dev_stop() is called, or when nbl_dev_start() unwinds through the
enable_mailbox_irq_err label, the driver disables and frees the mailbox
interrupt but does not cancel clean_mbx_task.

If an interrupt fires just before the IRQ is disabled, the work remains
queued on wq_mgt->ctrl_dev_wq and will execute asynchronously.

Could this lead to a use-after-free if devres frees the common_dev memory
when the device unbinds? Is it necessary to call cancel_work_sync() here so
that pending work is cancelled, and any running instance has finished, before
the structure is destroyed?
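
To make the ordering concern concrete, here is a small userspace model. It
is only a sketch under stated assumptions: all model_* names are
hypothetical, a boolean stands in for the workqueue's pending state, and
model_cancel_work_sync() merely imitates the "cancel if pending" half of the
kernel's cancel_work_sync(), not its wait for a running instance.

```c
#include <stdbool.h>

/* Userspace stand-ins; the driver itself uses struct work_struct. */
struct model_work {
	bool pending;
	int runs;
};

struct model_common_dev {
	struct model_work clean_mbx_task;
	bool irq_enabled;
	bool freed;	/* models devres releasing common_dev */
};

/* Models the mailbox ISR scheduling clean_mbx_task. */
static void model_mailbox_irq(struct model_common_dev *cd)
{
	if (cd->irq_enabled)
		cd->clean_mbx_task.pending = true;
}

/* Imitates cancel_work_sync(): drop pending work, report if any existed. */
static bool model_cancel_work_sync(struct model_work *work)
{
	bool was_pending = work->pending;

	work->pending = false;
	return was_pending;
}

/* Models the workqueue draining; returns false on a use-after-free. */
static bool model_run_workqueue(struct model_common_dev *cd)
{
	if (!cd->clean_mbx_task.pending)
		return true;
	cd->clean_mbx_task.pending = false;
	if (cd->freed)
		return false;
	cd->clean_mbx_task.runs++;
	return true;
}

static void model_dev_stop(struct model_common_dev *cd, bool cancel_work)
{
	/* Disable and free the IRQ first so no new work can be queued. */
	cd->irq_enabled = false;
	if (cancel_work)
		model_cancel_work_sync(&cd->clean_mbx_task);
}
```

In this model, stopping without the cancel step leaves the work pending, and
it then runs against freed state. Cancelling after the IRQ is torn down
closes that window.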