On Tue, Nov 7, 2023 at 2:55 PM David Ahern <dsahern@xxxxxxxxxx> wrote:That would be a Frankenstein's monster api with no good reason for it.
On 11/7/23 3:10 PM, Mina Almasry wrote:
On Mon, Nov 6, 2023 at 3:44 PM David Ahern <dsahern@xxxxxxxxxx> wrote:
On 11/5/23 7:44 PM, Mina Almasry wrote:
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index eeeda849115c..1c351c138a5b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -843,6 +843,9 @@ struct netdev_dmabuf_binding {
};
#ifdef CONFIG_DMA_SHARED_BUFFER
+struct page_pool_iov *
+netdev_alloc_devmem(struct netdev_dmabuf_binding *binding);
+void netdev_free_devmem(struct page_pool_iov *ppiov);
netdev_{alloc,free}_dmabuf?
Can do.
I say that because a dmabuf can be host memory, at least I am not aware
of a restriction that a dmabuf is device memory.
In my limited experience dma-buf is generally device memory, and
that's really its use case. CONFIG_UDMABUF is a driver that mocks
dma-buf with a memfd which I think is used for testing. But I can do
the rename, it's more clear anyway, I think.
config UDMABUF
bool "userspace dmabuf misc driver"
default n
depends on DMA_SHARED_BUFFER
depends on MEMFD_CREATE || COMPILE_TEST
help
A driver to let userspace turn memfd regions into dma-bufs.
Qemu can use this to create host dmabufs for guest framebuffers.
Qemu is just a userspace process; it is no way a special one.
Treating host memory as a dmabuf should radically simplify the io_uring
extension of this set.
I agree actually, and I was about to make that comment to David Wei's
series once I have the time.
David, your io_uring RX zerocopy proposal actually works with devmem
TCP, if you're inclined to do that instead, what you'd do roughly is
(I think):
- Allocate a memfd,
- Use CONFIG_UDMABUF to create a dma-buf out of that memfd.
- Bind the dma-buf to the NIC using the netlink API in this RFC.
- Your io_uring extensions and io_uring uapi should work as-is almost
on top of this series, I think.
If you do this the incoming packets should land into your memfd, which
may or may not work for you. In the future if you feel inclined to use
device memory, this approach that I'm describing here would be more
extensible to device memory, because you'd already be using dma-bufs
for your user memory; you'd just replace one kind of dma-buf (UDMABUF)
with another.
That the io_uring set needs to dive into
page_pools is just wrong - complicating the design and code and pushing
io_uring into a realm it does not need to be involved in.
Most (all?) of this patch set can work with any memory; only device
memory is unreadable.