Re: [PATCH v13 net-next 05/11] net/nebula-matrix: add channel layer

From: Illusion Wang

Date: Fri May 08 2026 - 23:46:34 EST


>> +static struct nbl_common_wq_mgt *wq_mgt;
>> +
>> +void nbl_common_queue_work(struct work_struct *task)
>> +{
>> +	queue_work(wq_mgt->ctrl_dev_wq, task);
>> +}
>> +
>> +void nbl_common_destroy_wq(void)
>> +{
>> +	destroy_workqueue(wq_mgt->ctrl_dev_wq);
>> +	kfree(wq_mgt);
>> +	wq_mgt = NULL;
>> +}
>> +
>> +int nbl_common_create_wq(void)
>> +{
>> +	wq_mgt = kzalloc_obj(*wq_mgt);
>> +	if (!wq_mgt)
>> +		return -ENOMEM;
>> +
>> +	wq_mgt->ctrl_dev_wq = alloc_workqueue("%s", WQ_MEM_RECLAIM | WQ_UNBOUND,
>> +					      0, "nbl_ctrldev_wq");
>> +	if (!wq_mgt->ctrl_dev_wq) {
>> +		pr_err("Failed to create workqueue nbl_ctrldev_wq\n");
>> +		goto alloc_ctrl_dev_wq_failed;
>> +	}
>> +
>> +	return 0;
>> +alloc_ctrl_dev_wq_failed:
>> +	kfree(wq_mgt);
>> +	wq_mgt = NULL;
>> +	return -ENOMEM;
>> +}
>Does this global singleton survive multiple adapter probes?
>wq_mgt is a file-scope static pointer. If nbl_common_create_wq() is
>called from a second probe, the prior wq_mgt is overwritten and the
>previous workqueue is leaked. On remove of the second device,
>nbl_common_destroy_wq() unconditionally calls destroy_workqueue() and
>kfree() and sets wq_mgt to NULL — so a later remove of the first
>device will dereference NULL in destroy_workqueue(wq_mgt->ctrl_dev_wq).
>nbl_common_queue_work() also dereferences wq_mgt with no NULL check.
>This patch does not wire up the callers, but the later series patch
>"net/nebula-matrix: add common dev start/stop operation"
>(1ff0ad206da3) does. Would refcounting create/destroy, or switching to
>per-adapter workqueues, address this?

In the last patch:
static int __init nbl_module_init(void)
{
	int status;

	status = nbl_common_create_wq();
	if (status) {
		pr_err("Failed to create wq, err = %d\n", status);
		goto wq_create_failed;
	}
	status = pci_register_driver(&nbl_driver);
	if (status) {
		pr_err("Failed to register PCI driver, err = %d\n", status);
		goto pci_register_driver_failed;
	}

	return 0;

pci_register_driver_failed:
	nbl_common_destroy_wq();
wq_create_failed:
	return status;
}

static void __exit nbl_module_exit(void)
{
	pci_unregister_driver(&nbl_driver);
	nbl_common_destroy_wq();
}

module_init(nbl_module_init);
module_exit(nbl_module_exit);

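nbl_common_create_wq() is called only from nbl_module_init(), and
nbl_common_destroy_wq() is called only from nbl_module_exit() and from the
error path of nbl_module_init(). Neither is called from probe or remove, so
the workqueue is created once per module load and is not overwritten when a
second adapter probes. So is the AI's suggestion also incorrect in this case?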
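
To make the lifetime argument concrete, here is a small userspace sketch of
the call ordering. It is only a model under stated assumptions: struct
fake_wq and all model_* names are hypothetical stand-ins (not kernel APIs and
not code from this series), and it assumes probe only queues work while
remove never touches wq_mgt.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of this is kernel API. */
struct fake_wq { int queued; };
struct wq_mgt_model { struct fake_wq *ctrl_dev_wq; };

static struct wq_mgt_model *wq_mgt;	/* module-wide singleton, as in the patch */

/* Models nbl_common_create_wq(): reached only from module init. */
static int model_create_wq(void)
{
	wq_mgt = calloc(1, sizeof(*wq_mgt));
	if (!wq_mgt)
		return -1;

	wq_mgt->ctrl_dev_wq = calloc(1, sizeof(*wq_mgt->ctrl_dev_wq));
	if (!wq_mgt->ctrl_dev_wq) {
		free(wq_mgt);
		wq_mgt = NULL;
		return -1;
	}
	return 0;
}

/* Models nbl_common_destroy_wq(): reached only from module exit. */
static void model_destroy_wq(void)
{
	free(wq_mgt->ctrl_dev_wq);
	free(wq_mgt);
	wq_mgt = NULL;
}

/* Assumed per-device probe: it only queues work on the shared queue. */
static void model_probe(void)
{
	wq_mgt->ctrl_dev_wq->queued++;	/* stands in for nbl_common_queue_work() */
}

/* Assumed per-device remove: it never creates or destroys wq_mgt. */
static void model_remove(void)
{
}

/* One module lifetime with two adapters; returns the number of works queued. */
static int model_lifetime(void)
{
	int queued;

	if (model_create_wq())		/* nbl_module_init() */
		return -1;
	model_probe();			/* first adapter */
	model_probe();			/* second adapter, same workqueue */
	model_remove();
	model_remove();
	queued = wq_mgt->ctrl_dev_wq->queued;
	model_destroy_wq();		/* nbl_module_exit() */
	return queued;
}
```

In this model both adapters share one queue, and create/destroy each run
exactly once, so nothing is overwritten or freed twice. If probe or remove
ever started calling create/destroy, the concern in the review would apply.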
---illusion