[PATCH v3 0/2] nvme-rdma: parallelize I/O queue setup

From: Surabhi Gogte

Date: Thu Jun 25 2026 - 17:27:43 EST


This series speeds up nvme-rdma connection and reconnection by setting up
I/O queues in parallel rather than sequentially. Allocation and startup of
each queue are combined into a single per-queue async work item, so the
per-queue connection latency overlaps across all queues. This matters
most on high-core-count hosts with many I/O queues, where serial setup
dominates connect time.

Patch 1 is a preparatory refactor: nvme_rdma_alloc_queue() now takes a
queue pointer, so that allocation and startup can be folded into a single
async worker.

Patch 2 implements the parallel setup: each I/O queue is allocated and
started from its own async work item.
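
For readers who want the shape of the change without opening the diff, here
is a minimal userspace sketch of the pattern. It is not the driver code:
struct demo_queue, demo_setup_one() and demo_setup_queues() are hypothetical
names, and pthread_create()/pthread_join() only stand in for scheduling work
through the kernel async API and waiting for it to finish.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-queue object; not the driver's struct. */
struct demo_queue {
	int idx;
	bool allocated;
	bool live;
};

/* One work item: allocate, then start, a single queue. */
static void *demo_setup_one(void *arg)
{
	struct demo_queue *q = arg;

	q->allocated = true;	/* stands in for the allocation step */
	q->live = true;		/* stands in for the start/connect step */
	return NULL;
}

/*
 * Launch one worker per queue, then wait for all of them.
 * Returns 0 on success or the pthread_create() error code.
 */
int demo_setup_queues(struct demo_queue *queues, pthread_t *workers, size_t n)
{
	size_t i, started = 0;
	int ret = 0;

	for (i = 0; i < n; i++) {
		queues[i].idx = (int)i;
		queues[i].allocated = false;
		queues[i].live = false;
	}

	/* Fan out: each queue's setup runs in its own worker. */
	for (i = 0; i < n; i++) {
		ret = pthread_create(&workers[i], NULL, demo_setup_one,
				     &queues[i]);
		if (ret)
			break;
		started++;
	}

	/* Barrier: nothing below runs until every launched worker is done. */
	for (i = 0; i < started; i++)
		pthread_join(workers[i], NULL);

	return ret;
}
```

Because every worker performs both steps for its own queue, only one wait
is needed at the end, rather than one wait after allocation and another
after startup.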

On a 64-core host with 64 I/O queues, nvme-rdma connection time drops
from ~1.4s to 416ms.

Signed-off-by: Surabhi Gogte <sgogte@xxxxxxxxxxxxxxx>
---
Changes from v2->v3:
- Split the series into two patches: extract the nvme_rdma_alloc_queue()
  refactor into a separate preparatory patch.
- Replace the atomic error flag in struct nvme_rdma_ctrl with a per-work
  nvme_rdma_setup_ctx { queue, err } struct.
- Fix formatting: indentation of wrapped lines and nesting.
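
To illustrate the per-work context item above, here is a rough userspace
sketch of how a { queue, err } context can replace a shared error flag. The
names (sketch_queue, sketch_setup_ctx, sketch_setup_all) and the fail_with
test hook are assumptions made for the example, and pthreads stand in for
the kernel async API; the actual patch may differ.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue; fail_with is a test hook, not a driver field. */
struct sketch_queue {
	bool live;
	int fail_with;	/* non-zero: setup fails with -fail_with */
};

/* Same { queue, err } shape as in the changelog; contents are assumed. */
struct sketch_setup_ctx {
	struct sketch_queue *queue;
	int err;
};

/* Work item: touches only its own ctx, so no shared flag is needed. */
static void *sketch_setup_work(void *arg)
{
	struct sketch_setup_ctx *ctx = arg;

	if (ctx->queue->fail_with) {
		ctx->err = -ctx->queue->fail_with;
		return NULL;
	}
	ctx->queue->live = true;
	ctx->err = 0;
	return NULL;
}

/* Returns 0, or the error of the lowest-numbered queue that failed. */
int sketch_setup_all(struct sketch_queue *queues,
		     struct sketch_setup_ctx *ctxs,
		     pthread_t *workers, size_t n)
{
	size_t i, started = 0;
	int ret = 0;

	for (i = 0; i < n; i++) {
		queues[i].live = false;
		ctxs[i].queue = &queues[i];
		ctxs[i].err = 0;
	}

	for (i = 0; i < n; i++) {
		int rc = pthread_create(&workers[i], NULL,
					sketch_setup_work, &ctxs[i]);
		if (rc) {
			ctxs[i].err = -rc;	/* never launched */
			break;
		}
		started++;
	}

	/* Wait for every launched work item before reading any ctx. */
	for (i = 0; i < started; i++)
		pthread_join(workers[i], NULL);

	/* Single-threaded from here: scan results in queue order. */
	for (i = 0; i < n && !ret; i++)
		ret = ctxs[i].err;

	/* On failure, undo the queues that did come up. */
	if (ret)
		for (i = 0; i < n; i++)
			queues[i].live = false;

	return ret;
}
```

Each work item writes only its own err, and the caller reads the contexts
only after all work has finished, so no atomics are needed and the caller
can tell which queue failed.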

Changes from v1->v2:
- Remove separate workqueue and use the async API instead.

Previous versions:
v1: https://lore.kernel.org/all/20260529001354.1003640-1-sgogte@xxxxxxxxxxxxxxx/
v2: https://lore.kernel.org/all/20260604195321.2232838-1-sgogte@xxxxxxxxxxxxxxx/

Surabhi Gogte (2):
  nvme-rdma: refactor nvme_rdma_alloc_queue() to take a queue pointer
  nvme-rdma: parallelize I/O queue allocation and startup

 drivers/nvme/host/rdma.c | 135 ++++++++++++++++++++++++---------------
 1 file changed, 82 insertions(+), 53 deletions(-)

--
2.54.0