On Wed, Nov 23, 2016 at 06:25:21PM -0700, Logan Gunthorpe wrote:
> On 23/11/16 02:55 PM, Jason Gunthorpe wrote:
> >>> Only ODP hardware allows changing the DMA address on the fly, and it
> >>> works at the page table level. We do not need special handling for
> >>> [...]
> >>
> >> I am aware of ODP but, noted by others, it doesn't provide a general
> >> solution to the points above.
> >
> > How do you mean?
>
> I was only saying it wasn't general in that it wouldn't work for IB
> hardware that doesn't support ODP or other hardware that doesn't do
> similar things (like an NVMe drive).

There are three cases to worry about:
- Coherent long lived page table mirroring (RDMA ODP MR)
- Non-coherent long lived page table mirroring (RDMA MR)
- Short lived DMA mapping (everything else)
Like you say below, we have to handle short lived mappings in the usual
way, and that covers basically every device except IB MRs, including the
command queue on an NVMe drive.
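To make the three cases concrete, here is a minimal sketch in plain userspace C. All of the names are hypothetical and made up for illustration; none of this is an existing kernel API. It only encodes what is said above: the ODP-style case is the one where the DMA address can change on the fly, and the two MR cases are the long lived ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only; not a kernel API. */
enum dma_map_kind {
    DMA_MAP_COHERENT_MIRROR,    /* coherent long lived mirroring (RDMA ODP MR) */
    DMA_MAP_NONCOHERENT_MIRROR, /* non-coherent long lived mirroring (RDMA MR) */
    DMA_MAP_SHORT_LIVED,        /* everything else, e.g. an NVMe command queue */
};

/* Human-readable label for each case. */
static const char *dma_map_kind_name(enum dma_map_kind kind)
{
    switch (kind) {
    case DMA_MAP_COHERENT_MIRROR:    return "coherent long lived mirror";
    case DMA_MAP_NONCOHERENT_MIRROR: return "non-coherent long lived mirror";
    case DMA_MAP_SHORT_LIVED:        return "short lived mapping";
    }
    assert(!"unknown dma_map_kind");
    return "unknown";
}

/* Only the ODP-style case lets the DMA address change on the fly. */
static bool can_change_dma_address_on_the_fly(enum dma_map_kind kind)
{
    return kind == DMA_MAP_COHERENT_MIRROR;
}

/* The two MR cases are long lived; everything else is short lived. */
static bool is_long_lived(enum dma_map_kind kind)
{
    return kind != DMA_MAP_SHORT_LIVED;
}
```

With this split, the awkward case is the one where is_long_lived() is true and can_change_dma_address_on_the_fly() is false, i.e. a plain RDMA MR.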
> any complex allocators (GPU or otherwise) should respect that. And that
> seems like it should be the default way most of this works -- and I
> think it wouldn't actually take too much effort to make it all work now
> as is. (Our iopmem work is actually quite small and simple.)

Yes, absolutely, some kind of page pinning like locking is a hard
requirement.
> Yeah, we've had RDMA and O_DIRECT transfers to PCIe backed ZONE_DEVICE
> memory working for some time. I'd say it's a good fit. The main question
> we've had is how to expose PCIe bars to userspace to be used as MRs and
> [...]

Is there any progress on that?

I still don't quite get what iopmem was about. I thought the
objection to uncacheable ZONE_DEVICE & DAX made sense, so running DAX
over iopmem and still ending up with uncacheable mmaps still seems
like a non-starter to me...

Serguei, what is your plan in GPU land for migration? I.e., if I have a
CPU mapped page and the GPU moves it to VRAM, it becomes non-cacheable
- do you still allow the CPU to access it? Or do you swap it back to
cacheable memory if the CPU touches it?

One approach might be to mmap the uncacheable ZONE_DEVICE memory and
mark it inaccessible to the CPU - DMA could still translate. If the
CPU needs it, then the kernel migrates it to system memory so it
becomes cacheable. ??
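A toy userspace model of that last idea, just to pin down the intended behaviour. Everything here is hypothetical (struct toy_page, the toy_* helpers, the addresses); it is not kernel code and it does not model faults, page tables, or in-flight DMA. It only shows the policy: DMA translation works wherever the page lives, and the first CPU touch of a hidden device page migrates it to system memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; every name here is hypothetical, not a kernel API. */
enum page_home { HOME_SYSTEM_RAM, HOME_DEVICE };

struct toy_page {
    enum page_home home;   /* where the data currently lives */
    bool cpu_accessible;   /* false models a CPU mapping marked inaccessible */
    unsigned long addr;    /* address a DMA engine would be given */
    unsigned migrations;   /* times the page was moved to system RAM */
};

/* Place the page in uncacheable device memory and hide it from the CPU. */
static void toy_place_on_device(struct toy_page *p, unsigned long dev_addr)
{
    assert(p != NULL);
    p->home = HOME_DEVICE;
    p->cpu_accessible = false;
    p->addr = dev_addr;
}

/* DMA translation works no matter where the page lives. */
static unsigned long toy_dma_translate(const struct toy_page *p)
{
    assert(p != NULL);
    return p->addr;
}

/* A CPU touch of a hidden page stands in for a fault: migrate, then allow access. */
static void toy_cpu_touch(struct toy_page *p, unsigned long sys_addr)
{
    assert(p != NULL);
    if (!p->cpu_accessible) {
        p->home = HOME_SYSTEM_RAM;  /* data now sits in cacheable memory */
        p->addr = sys_addr;
        p->cpu_accessible = true;
        p->migrations++;
    }
}
```

Note that toy_cpu_touch() changes the address DMA would use, which is exactly why a real version would have to deal with the pinning requirement discussed earlier before it could migrate anything.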