Re: [PATCH v11 net-next 05/11] net/nebula-matrix: add channel layer
From: Simon Horman
Date: Sun Apr 12 2026 - 08:59:41 EST
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/nebula-matrix: add channel layer
This commit adds a channel management layer for the nebula-matrix
network driver, providing mailbox-based communication between PF0
and other PFs with message send/receive, ACK handling, and
hash-based message handler lookup.
> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c
> --- a/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c
[ ... ]
> +static int nbl_chan_send_msg(struct nbl_channel_mgt *chan_mgt,
> + struct nbl_chan_send_info *chan_send)
> +{
[ ... ]
> + if (test_bit(NBL_CHAN_INTERRUPT_READY, chan_info->state)) {
> + ret = wait_event_timeout(wait_head->wait_queue,
> + wait_head->acked,
> + NBL_CHAN_ACK_WAIT_TIME);
> + if (!ret) {
> + wait_head->status = NBL_MBX_STATUS_TIMEOUT;
> + dev_err(dev,
> + "Channel waiting ack failed, message type: %d, msg id: %u\n",
> + chan_send->msg_type, msgid.id);
> + return -ETIMEDOUT;
> + }
[ ... ]
> + /*polling wait mailbox ack*/
> + while (i--) {
> + nbl_chan_clean_queue(chan_mgt, chan_info);
> +
> + if (wait_head->acked) {
> + chan_send->ack_len = wait_head->ack_data_len;
> + wait_head->status = NBL_MBX_STATUS_IDLE;
> + return wait_head->ack_err;
> + }
> + usleep_range(NBL_CHAN_TX_WAIT_ACK_US_MIN,
> + NBL_CHAN_TX_WAIT_ACK_US_MAX);
> + }
> +
> + wait_head->status = NBL_MBX_STATUS_TIMEOUT;
> + dev_err(dev,
> + "Channel polling ack failed, message type: %d msg id: %u\n",
> + chan_send->msg_type, msgid.id);
> +
> + return -EFAULT;
^^^^^^
The interrupt-based ACK wait path above returns -ETIMEDOUT on timeout,
but this polling-based ACK wait path returns -EFAULT. -EFAULT ("bad
address") is not the right error code for a timeout condition.
Should this also return -ETIMEDOUT for consistency? Any caller that
checks specifically for -ETIMEDOUT to handle timeouts would miss
this polling timeout case.
> +}
[ ... ]
> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c
> --- /dev/null
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c
[ ... ]
> +int nbl_common_create_wq(void)
> +{
> + wq_mgt = kzalloc_obj(*wq_mgt);
> + if (!wq_mgt)
> + return -ENOMEM;
> +
> + wq_mgt->ctrl_dev_wq = alloc_workqueue("%s", WQ_MEM_RECLAIM | WQ_UNBOUND,
> + 0, "nbl_ctrldev_wq");
> + if (!wq_mgt->ctrl_dev_wq) {
> + pr_err("Failed to create workqueue nbl_ctrldev_wq\n");
> + goto alloc_ctrl_dev_wq_failed;
> + }
> +
> + return 0;
> +alloc_ctrl_dev_wq_failed:
> + kfree(wq_mgt);
> + return -ENOMEM;
> +}
When alloc_workqueue() fails, wq_mgt is freed but the static pointer
is not set to NULL. After this error path, wq_mgt becomes a dangling
pointer.
If nbl_common_queue_work() or nbl_common_destroy_wq() is called
afterwards, they dereference wq_mgt unconditionally:
nbl_common_queue_work():
queue_work(wq_mgt->ctrl_dev_wq, task);
nbl_common_destroy_wq():
destroy_workqueue(wq_mgt->ctrl_dev_wq);
kfree(wq_mgt);
Would adding "wq_mgt = NULL;" after kfree(wq_mgt) on the error path
prevent this?
[ ... ]
> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.h b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.h
> --- /dev/null
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.h
[ ... ]
> +/* it used for y_axis no necessay */
> +struct nbl_hash_entry_node {
> + struct hlist_node node;
> + void *key;
> + void *data;
> +};
This isn't a bug, but the comment "it used for y_axis no necessay"
appears garbled and "necessay" is misspelled. Could the comment be
rewritten to explain the purpose of this struct, or removed if it
is no longer relevant?