[PATCH v4 0/2] nvme-rdma: parallelize I/O queue setup

From: Surabhi Gogte

Date: Sat Jun 27 2026 - 00:16:47 EST


This series speeds up nvme-rdma connection and reconnection by setting up
I/O queues in parallel instead of sequentially. Allocation and startup of
each queue are combined into a single per-queue async work item, so
per-queue connection latency overlaps across all queues. This matters
most on high-core-count hosts with many I/O queues, where serial setup
dominates connect time.

Patch 1 is a preparatory refactor: nvme_rdma_alloc_queue() takes a queue
pointer so allocation and startup can be folded into a single async
worker.

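Patch 2 implements asynchronous I/O queue setup using the kernel async
API. Each work item carries its own nvme_rdma_setup_ctx { queue, err },
so errors are reported per queue.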
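
For readers who want the shape of the pattern without reading the diff,
here is a minimal userspace sketch. It is an analogy only, not the code
from patch 2: it uses pthreads in place of the kernel async API, and
setup_ctx, setup_one_queue(), setup_worker(), setup_io_queues() and the
fail_idx knob are all hypothetical names made up for illustration. It
shows one worker per queue doing allocation and startup together, a
per-worker { queue, err } context, and a single wait before errors are
collected.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-work context: each worker owns one queue and one err. */
struct setup_ctx {
	int queue_idx;		/* I/O queue handled by this worker */
	int fail_idx;		/* demo only: queue index that should fail, or -1 */
	int err;		/* written only by the owning worker */
	pthread_t thread;
};

/* Stand-in for "allocate + start one queue" (no real RDMA work here). */
static int setup_one_queue(int idx, int fail_idx)
{
	return idx == fail_idx ? -ECONNREFUSED : 0;
}

static void *setup_worker(void *arg)
{
	struct setup_ctx *ctx = arg;

	ctx->err = setup_one_queue(ctx->queue_idx, ctx->fail_idx);
	return NULL;
}

/* Set up queues 1..nr_io_queues in parallel; return 0 or the first error. */
static int setup_io_queues(int nr_io_queues, int fail_idx)
{
	struct setup_ctx *ctxs;
	int i, rc, started = 0, ret = 0;

	if (nr_io_queues <= 0)
		return -EINVAL;

	ctxs = calloc(nr_io_queues, sizeof(*ctxs));
	if (!ctxs)
		return -ENOMEM;

	/* Launch one worker per queue; queue 0 is reserved for admin. */
	for (i = 0; i < nr_io_queues; i++) {
		ctxs[i].queue_idx = i + 1;
		ctxs[i].fail_idx = fail_idx;
		rc = pthread_create(&ctxs[i].thread, NULL, setup_worker,
				    &ctxs[i]);
		if (rc) {
			ret = -rc;
			break;
		}
		started++;
	}

	/* Wait for every launched worker, then pick up the first error. */
	for (i = 0; i < started; i++) {
		pthread_join(ctxs[i].thread, NULL);
		if (ctxs[i].err && !ret)
			ret = ctxs[i].err;
	}

	free(ctxs);
	return ret;
}
```

Calling setup_io_queues(64, -1) models the all-success case, and passing
a queue index as fail_idx models a single failed queue. Because each
worker writes only its own err field, no shared atomic flag is needed;
the caller reads the results after the join.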

Testing on a 64-core host with 64 I/O queues shows nvme-rdma connection
time reduced from ~1.4 s to 416 ms, roughly a 3.4x improvement.

Signed-off-by: Surabhi Gogte <sgogte@xxxxxxxxxxxxxxx>
---
Changes from v3->v4:
- Fix formatting.
- Replace kmalloc_array() with kmalloc_objs().

Changes from v2->v3:
- Split the series into two patches: extract the nvme_rdma_alloc_queue()
  refactor into a separate preparatory patch.
- Replace the atomic error flag in struct nvme_rdma_ctrl with a per-work
  nvme_rdma_setup_ctx { queue, err } struct.
- Fix formatting: indentation of wrapped lines and nesting.

Changes from v1->v2:
- Remove separate workqueue and use the async API instead.

Previous versions:
v1: https://lore.kernel.org/all/20260529001354.1003640-1-sgogte@xxxxxxxxxxxxxxx/
v2: https://lore.kernel.org/all/20260604195321.2232838-1-sgogte@xxxxxxxxxxxxxxx/
v3: https://lore.kernel.org/all/20260625212722.1302344-1-sgogte@xxxxxxxxxxxxxxx/

Surabhi Gogte (2):
  nvme-rdma: refactor nvme_rdma_alloc_queue() to take a queue pointer
  nvme-rdma: parallelize I/O queue allocation and startup

 drivers/nvme/host/rdma.c | 136 ++++++++++++++++++++++++---------------
 1 file changed, 83 insertions(+), 53 deletions(-)

--
2.54.0