[PATCH v3 0/2] nvme-rdma: parallelize I/O queue setup
From: Surabhi Gogte
Date: Thu Jun 25 2026 - 17:27:43 EST
This patch series speeds up nvme-rdma connection and reconnection by
setting up I/O queues in parallel instead of sequentially. Allocation and
startup of each queue are combined into a single per-queue async work
item, so per-queue connection latency overlaps across all queues. This
matters most on high-core-count hosts with many I/O queues, where serial
setup dominates connect time.
Patch 1 is a preparatory refactor: nvme_rdma_alloc_queue() takes a queue
pointer so allocation and startup can be folded into a single async
worker.
Patch 2 contains the implementation for async setup of I/O queues.
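For readers who want the shape of the approach without opening the diff,
here is a minimal userspace sketch of the pattern described above. It is
an analogy under stated assumptions, not the code in patch 2: POSIX
threads stand in for the kernel async API, and struct demo_queue,
struct demo_setup_ctx, demo_alloc_and_start() and demo_setup_io_queues()
are hypothetical names invented for illustration. Only the idea of a
per-work context carrying { queue, err } comes from this series.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_QUEUES 128

/* Hypothetical stand-in for a driver queue; not struct nvme_rdma_queue. */
struct demo_queue {
	int live;	/* set once allocation and startup both succeeded */
	int fail;	/* demo hook: force setup of this queue to fail */
};

/* Per-work context: each worker reports into its own err slot. */
struct demo_setup_ctx {
	struct demo_queue *queue;
	int err;
};

/* Hypothetical combined "allocate + start" step for a single queue. */
static int demo_alloc_and_start(struct demo_queue *q)
{
	if (q->fail)
		return -ECONNREFUSED;
	q->live = 1;
	return 0;
}

static void *demo_setup_worker(void *arg)
{
	struct demo_setup_ctx *ctx = arg;

	ctx->err = demo_alloc_and_start(ctx->queue);
	return NULL;
}

/*
 * Set up all queues concurrently, wait for every worker, then return 0
 * or the error of the lowest-indexed queue that failed. On failure,
 * queues that did come up are marked down again.
 */
int demo_setup_io_queues(struct demo_queue *queues, int nr_queues)
{
	struct demo_setup_ctx ctx[MAX_QUEUES];
	pthread_t tid[MAX_QUEUES];
	int started[MAX_QUEUES] = {0};
	int i, ret = 0;

	assert(nr_queues >= 1 && nr_queues <= MAX_QUEUES);

	for (i = 0; i < nr_queues; i++) {
		ctx[i].queue = &queues[i];
		ctx[i].err = 0;
		if (pthread_create(&tid[i], NULL, demo_setup_worker, &ctx[i]))
			ctx[i].err = -EAGAIN;
		else
			started[i] = 1;
	}

	/* Barrier: nothing is inspected until every worker has finished. */
	for (i = 0; i < nr_queues; i++) {
		if (started[i])
			pthread_join(tid[i], NULL);
		if (ctx[i].err && !ret)
			ret = ctx[i].err;
	}

	if (ret)
		for (i = 0; i < nr_queues; i++)
			queues[i].live = 0;

	return ret;
}
```

Because every worker writes only to its own ctx slot and the caller reads
the slots after the join, the sketch needs no shared atomic flag.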
Testing on a 64-core host with 64 I/O queues shows nvme-rdma connection
time reduced from ~1.4s to 416ms, roughly a 3.4x speedup.
Signed-off-by: Surabhi Gogte <sgogte@xxxxxxxxxxxxxxx>
---
Changes from v2->v3:
- Split the series into two patches: extract the nvme_rdma_alloc_queue()
  refactor into a separate preparatory patch.
- Replace the atomic error flag in struct nvme_rdma_ctrl with a per-work
  nvme_rdma_setup_ctx { queue, err } struct.
- Fix formatting issues with continuation-line indentation and nesting.
Changes from v1->v2:
- Remove separate workqueue and use the async API instead.
Previous versions:
v1: https://lore.kernel.org/all/20260529001354.1003640-1-sgogte@xxxxxxxxxxxxxxx/
v2: https://lore.kernel.org/all/20260604195321.2232838-1-sgogte@xxxxxxxxxxxxxxx/
Surabhi Gogte (2):
  nvme-rdma: refactor nvme_rdma_alloc_queue() to take a queue pointer
  nvme-rdma: parallelize I/O queue allocation and startup

 drivers/nvme/host/rdma.c | 135 ++++++++++++++++++++++++---------------
 1 file changed, 82 insertions(+), 53 deletions(-)
--
2.54.0