Re: [PATCH v2] nvme-rdma: parallelize I/O queue allocation and startup

From: Keith Busch

Date: Tue Jun 09 2026 - 16:09:49 EST

On Thu, Jun 04, 2026 at 01:53:21PM -0600, Surabhi Gogte wrote:
> Refactor nvme rdma I/O queue setup to use async API, combining
> allocation and startup into a single parallel operation per queue. This
> reduces connection and reconnection setup time when there are delays in
> establishing connections, which is especially important for
> high-core-count hosts.

Mostly looks fine.

> @@ -16,6 +16,7 @@
> #include <linux/types.h>
> #include <linux/list.h>
> #include <linux/mutex.h>
> +#include <linux/async.h>
> #include <linux/scatterlist.h>
> #include <linux/nvme.h>
> #include <linux/unaligned.h>
> @@ -125,6 +126,7 @@ struct nvme_rdma_ctrl {
> struct nvme_ctrl ctrl;
> bool use_inline_data;
> u32 io_queues[HCTX_MAX_TYPES];
> + atomic_t qsetup_err;
> };

This new field serves only to propagate an error out of a local context,
so I don't want to add it at controller scope. I'd prefer you declare a
dedicated context struct to use with the async calls:

	struct nvme_rdma_setup_ctx {
		struct nvme_rdma_queue *queue;
		int *err;
	};

And then make that the cookie passed to the async setup. I don't think
it needs to be atomic here either: we really don't care if we see the
first or last error, so forcing a cmpxchg for the first one is a bit
overkill; you can just do READ/WRITE_ONCE instead and accept the race.
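
To illustrate the pattern, here is a rough userspace sketch, not driver
code. It assumes pthreads in place of the kernel async API, and it
defines READ_ONCE/WRITE_ONCE as relaxed-atomic stand-ins for the kernel
macros. fake_queue, setup_ctx, setup_one_queue and setup_all_queues are
hypothetical names made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-ins (assumption) for the kernel's READ_ONCE/WRITE_ONCE. */
#define READ_ONCE(x)     __atomic_load_n(&(x), __ATOMIC_RELAXED)
#define WRITE_ONCE(x, v) __atomic_store_n(&(x), (v), __ATOMIC_RELAXED)

#define MAX_QUEUES 64

/* Hypothetical stand-in for struct nvme_rdma_queue. */
struct fake_queue {
	int idx;
	int fail;	/* 0 on success, else the negative errno to report */
};

/* Per-queue cookie: the queue plus a pointer to the caller's error slot. */
struct setup_ctx {
	struct fake_queue *queue;
	int *err;
};

/* Plays the role of the async callback; the cookie is a setup_ctx. */
static void *setup_one_queue(void *cookie)
{
	struct setup_ctx *ctx = cookie;

	/* Any failing worker may overwrite another's error; that is fine. */
	if (ctx->queue->fail)
		WRITE_ONCE(*ctx->err, ctx->queue->fail);
	return NULL;
}

/* Start one worker per queue, wait for all of them, return some error. */
static int setup_all_queues(struct fake_queue *queues, int nr)
{
	pthread_t threads[MAX_QUEUES];
	struct setup_ctx ctx[MAX_QUEUES];
	int err = 0, started = 0, i;

	assert(nr >= 0 && nr <= MAX_QUEUES);

	for (i = 0; i < nr; i++) {
		ctx[i].queue = &queues[i];
		ctx[i].err = &err;	/* error lives on the caller's stack */
		if (pthread_create(&threads[i], NULL, setup_one_queue, &ctx[i])) {
			WRITE_ONCE(err, -EAGAIN);
			break;
		}
		started++;
	}

	/* Equivalent in spirit to waiting for all async work to finish. */
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return READ_ONCE(err);
}
```

The error slot has the same lifetime as the setup call, so nothing has
to be added to the controller struct. If several queues fail, the caller
sees whichever error was written last, which is all it needs in order to
know that setup failed.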