Re: Enabling peer to peer device transactions for PCIe devices

From: Logan Gunthorpe
Date: Tue Dec 06 2016 - 16:47:15 EST


Hey,

> Okay, so clearly this needs a kernel side NVMe specific allocator
> and locking so users don't step on each other..

Yup, ideally. That's why device dax isn't ideal for this application: it
doesn't provide any way to prevent users from stepping on each other.

> Or as Christoph says some kind of general mechanism to get these
> bounce buffers..

Yeah, I imagine a general allocate-from-BAR/region system would be very
useful.

> Ah, I see.
>
> As a first draft I'd stick with some kind of API built into the
> /dev/nvmeX that backs the filesystem. The user app would fstat the
> target file, open /dev/block/MAJOR(st_dev):MINOR(st_dev), do some
> ioctl to get a CMB mmap, and then proceed from there..
>
> When that is all working kernel-side, it would make sense to look at a
> more general mechanism that could be used unprivileged??

That makes a lot of sense to me. I suggested mmapping the char device
because it's really easy, but I can see that an ioctl on the block
device does seem more general and device agnostic.
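
For illustration, the user-space lookup step quoted above (fstat the
target file, then find its backing block device node) might look roughly
like the sketch below. The helper name backing_blockdev_path() is
hypothetical. The ioctl that would hand back a CMB mmap does not exist
yet, so the sketch stops at building the device path. It also assumes
st_dev names a real block device with a /dev/block/MAJOR:MINOR entry,
which may not hold for every filesystem.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "/dev/block/MAJOR:MINOR" for the device
 * backing `file` into `out`. Returns 0 on success, -1 on any failure
 * (open/fstat error, or `out` too small).
 */
int backing_blockdev_path(const char *file, char *out, size_t outlen)
{
	struct stat st;
	int fd, n;

	fd = open(file, O_RDONLY);
	if (fd < 0)
		return -1;

	if (fstat(fd, &st) != 0) {
		close(fd);
		return -1;
	}
	close(fd);

	/* st_dev identifies the device containing the file. */
	n = snprintf(out, outlen, "/dev/block/%u:%u",
		     major(st.st_dev), minor(st.st_dev));
	if (n < 0 || (size_t)n >= outlen)
		return -1;

	/*
	 * Next step (not shown, no such interface exists yet): open the
	 * path in `out` and issue the proposed ioctl to get a CMB mmap.
	 */
	return 0;
}
```

A caller would pass the path of the target file and a buffer of a few
dozen bytes, then open the resulting path to reach the block device.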

> This is similar to the GPU issues too.. On NVMe you don't need to pin
> the pages, you just need to lock that VMA so it doesn't get freed from
> the NVMe CMB allocator while the IO is running...
> Probably in the long run the get_user_pages is going to have to be
> pushed down into drivers.. Future MMU coherent IO hardware also does
> not need the pinning or other overheads.

Yup. Yup.

Logan