I haven't done this yet, but I think simply picking the closest device in
the tree would solve the issue sufficiently. However, I originally had it
so the user has to pick the device, and I prefer that approach. But if the
user picks the device, then why bother restricting what they pick? Per the
thread with Sinan, I'd prefer to use what the user picks. You were one
of the biggest opponents of that, so I'd like to hear your opinion on
removing the restrictions.
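
To make the "closest device in the tree" idea a bit more concrete, here is
a rough userspace sketch. It is not kernel code: struct toy_dev and the
toy_* helpers are made-up names for illustration, and "distance" is assumed
to mean the number of hops between two devices through their lowest common
upstream node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node in the PCI hierarchy (not struct pci_dev). */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;	/* upstream bridge/switch, NULL at the root */
};

/* Number of hops from dev up to the root of its tree. */
static int toy_depth(const struct toy_dev *dev)
{
	int depth = 0;

	assert(dev != NULL);
	while (dev->parent) {
		dev = dev->parent;
		depth++;
	}
	return depth;
}

/*
 * Hops between a and b through their lowest common ancestor,
 * or -1 if the two devices do not share any ancestor.
 */
static int toy_distance(const struct toy_dev *a, const struct toy_dev *b)
{
	int da = toy_depth(a), db = toy_depth(b), dist = 0;

	/* Bring the deeper device up to the depth of the shallower one. */
	while (da > db) { a = a->parent; da--; dist++; }
	while (db > da) { b = b->parent; db--; dist++; }

	/* Walk both up in lockstep until they meet. */
	while (a != b) {
		if (!a->parent || !b->parent)
			return -1;
		a = a->parent;
		b = b->parent;
		dist += 2;
	}
	return dist;
}

/* Pick the provider with the smallest distance to client, or NULL if none. */
static const struct toy_dev *toy_closest(const struct toy_dev *client,
					 const struct toy_dev *const *providers,
					 size_t n)
{
	const struct toy_dev *best = NULL;
	int best_dist = -1;

	for (size_t i = 0; i < n; i++) {
		int d = toy_distance(client, providers[i]);

		if (d >= 0 && (best == NULL || d < best_dist)) {
			best = providers[i];
			best_dist = d;
		}
	}
	return best;
}
```

With something like this, a memory device behind the same switch as the
NIC would win over one behind a different switch. Whether hop count is the
right metric for real topologies is an open question.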
Ideally, we'd want to use an NVME CMB buffer as p2p memory. This would
save an extra PCI transfer as the NVME card could just take the data
out of its own memory. However, at this time, cards with CMB buffers
don't seem to be available.
Even if they were available, it would be hard to make real use of this
given that we wouldn't know how to pre-post recv buffers (for in-capsule
data). But let's leave this out of the scope entirely...
I don't understand what you're referring to. We'd simply use the CMB
buffer as a p2pmem device, why does that change anything?
Why do you need this? You have a reference to the
queue itself.
This keeps track of whether the response was actually allocated with
p2pmem or not. It's needed for when we free the SGL because the queue
may have a p2pmem device assigned to it but, if the alloc failed and it
fell back on system memory, then we need to know how to free it. I'm
currently looking at giving SGLs an iomem flag, in which case this would
no longer be needed, as the flag in the SGL could be used instead.
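
As a rough illustration of why the per-response flag matters, here is a
userspace sketch of the alloc-with-fallback pattern. Everything in it is
hypothetical: struct toy_pool is a trivial bump allocator standing in for a
p2pmem device, malloc() stands in for system memory, and none of the names
come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical fixed-size pool standing in for a p2pmem device. */
struct toy_pool {
	unsigned char mem[256];
	size_t used;
};

struct toy_rsp {
	void *buf;
	size_t len;
	bool from_pool;	/* records which allocator actually produced buf */
};

/* Bump allocation; returns NULL if there is no pool or it is exhausted. */
static void *toy_pool_alloc(struct toy_pool *pool, size_t len)
{
	void *p;

	if (!pool || len > sizeof(pool->mem) - pool->used)
		return NULL;
	p = pool->mem + pool->used;
	pool->used += len;
	return p;
}

/* Simplistic release: only adjusts the usage counter. */
static void toy_pool_free(struct toy_pool *pool, size_t len)
{
	pool->used -= len;
}

/* Try the pool first, then fall back to system memory. */
static int toy_rsp_alloc(struct toy_rsp *rsp, struct toy_pool *pool, size_t len)
{
	assert(rsp != NULL);

	rsp->buf = toy_pool_alloc(pool, len);
	rsp->from_pool = rsp->buf != NULL;
	if (!rsp->buf)
		rsp->buf = malloc(len);
	if (!rsp->buf)
		return -1;
	rsp->len = len;
	return 0;
}

/* The queue having a pool is not enough: free by how buf was allocated. */
static void toy_rsp_free(struct toy_rsp *rsp, struct toy_pool *pool)
{
	if (rsp->from_pool)
		toy_pool_free(pool, rsp->len);
	else
		free(rsp->buf);
	rsp->buf = NULL;
}
```

If the free path only checked whether the queue has a pool, the second
kind of buffer (the malloc() fallback) would be handed back to the wrong
allocator. A flag carried in the SGL itself would serve the same purpose.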
This is a problem. Namespaces can be added at any point in time. No one
guarantees that dma_devs covers all the namespaces we'll ever see.
Yeah, well restricting p2pmem based on all the devices in use is hard.
So we'd need a call into the transport every time an ns is added and
we'd have to drop the p2pmem if they add one that isn't supported. This
complexity is just one of the reasons I prefer just letting the user choose.
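
For what it's worth, the extra hook would look roughly like the sketch
below. This is only an illustration: toy_queue, toy_ns and the p2p_capable
flag are invented here, and the real compatibility check would have to look
at the actual devices rather than a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical namespace: p2p_capable stands in for a real topology check. */
struct toy_ns {
	const char *name;
	bool p2p_capable;
};

/* Hypothetical queue: p2pmem is NULL when using system memory. */
struct toy_queue {
	const char *p2pmem;
};

/*
 * Would be called by the core every time a namespace is added.
 * Returns true if the queue had to give up its p2pmem device.
 */
static bool toy_queue_ns_added(struct toy_queue *queue, const struct toy_ns *ns)
{
	assert(queue != NULL && ns != NULL);

	if (!queue->p2pmem || ns->p2p_capable)
		return false;

	/* New namespace cannot use this p2pmem: fall back to system memory. */
	queue->p2pmem = NULL;
	return true;
}
```

Note that this sketch ignores in-flight I/O that is still using the p2pmem
buffers when the hook fires, which is where most of the real complexity
would be.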
+
+ if (queue->p2pmem)
+ pr_debug("using %s for rdma nvme target queue",
+ dev_name(&queue->p2pmem->dev));
+
+ kfree(dma_devs);
+}
+
static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
struct rdma_cm_event *event)
{
@@ -1199,6 +1271,8 @@ static int nvmet_rdma_queue_connect(struct rdma_cm_id *cm_id,
}
queue->port = cm_id->context;
+ nvmet_rdma_queue_setup_p2pmem(queue);
+
Why is all this done for each queue? Looks completely redundant to me.
A little bit. Where would you put it?
ret = nvmet_rdma_cm_accept(cm_id, queue, &event->param.conn);
if (ret)
goto release_queue;
You seem to have skipped the in-capsule buffers for p2pmem (inline_page).
I'm curious why?
Yes, the thinking was that these transfers were small anyway so there
would not be significant benefit to pushing them through p2pmem. There's
really no reason why we couldn't do that if it made sense to, though.