Re: Enabling peer to peer device transactions for PCIe devices

From: Logan Gunthorpe
Date: Wed Nov 23 2016 - 20:25:32 EST




On 23/11/16 02:55 PM, Jason Gunthorpe wrote:
>>> Only ODP hardware allows changing the DMA address on the fly, and it
>>> works at the page table level. We do not need special handling for
>>> RDMA.
>>
>> I am aware of ODP but, noted by others, it doesn't provide a general
>> solution to the points above.
>
> How do you mean?

I was only saying it wasn't general in that it wouldn't work for IB
hardware that doesn't support ODP or other hardware that doesn't do
similar things (like an NVMe drive).

It makes sense for hardware that supports ODP to allow MRs to not pin
the underlying memory and provide for migrations that the hardware can
follow. But most DMA engines will require the memory to be pinned and
any complex allocators (GPU or otherwise) should respect that. And that
seems like it should be the default way most of this works -- and I
think it wouldn't actually take too much effort to make it all work now
as is. (Our iopmem work is actually quite small and simple.)

>> It's also worth noting that #4 makes use of ZONE_DEVICE (#2) so they are
>> really the same option. iopmem is really just one way to get BAR
>> addresses to user-space while inside the kernel it's ZONE_DEVICE.
>
> Seems fine for RDMA?

Yeah, we've had RDMA and O_DIRECT transfers to PCIe backed ZONE_DEVICE
memory working for some time. I'd say it's a good fit. The main question
we've had is how to expose PCIe bars to userspace to be used as MRs and
such.


Logan