[PATCH 4/5] RDMA/mlx4: WQ_PERCPU added to alloc_workqueue users

From: Marco Crivellari
Date: Sat Nov 01 2025 - 12:37:35 EST


Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistentcy cannot be addressed without refactoring the API.

alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.

This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.

This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.

With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.

Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.

CC: Yishai Hadas <yishaih@xxxxxxxxxx>
Suggested-by: Tejun Heo <tj@xxxxxxxxxx>
Signed-off-by: Marco Crivellari <marco.crivellari@xxxxxxxx>
---
drivers/infiniband/hw/mlx4/cm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
index 12b481d138cf..03aacd526860 100644
--- a/drivers/infiniband/hw/mlx4/cm.c
+++ b/drivers/infiniband/hw/mlx4/cm.c
@@ -591,7 +591,7 @@ void mlx4_ib_cm_paravirt_clean(struct mlx4_ib_dev *dev, int slave)

int mlx4_ib_cm_init(void)
{
- cm_wq = alloc_workqueue("mlx4_ib_cm", 0, 0);
+ cm_wq = alloc_workqueue("mlx4_ib_cm", WQ_PERCPU, 0);
if (!cm_wq)
return -ENOMEM;

--
2.51.0