Re: [PATCH net-next v7 4/4] vsock/virtio: MSG_ZEROCOPY flag support

From: Stefano Garzarella
Date: Mon Sep 04 2023 - 04:22:52 EST


On Sun, Sep 03, 2023 at 11:13:23AM +0300, Arseniy Krasnov wrote:


On 01.09.2023 15:30, Stefano Garzarella wrote:
On Sun, Aug 27, 2023 at 11:54:36AM +0300, Arseniy Krasnov wrote:
This adds handling of MSG_ZEROCOPY flag on transmission path: if this
flag is set and zerocopy transmission is possible (enabled in socket
options and transport allows zerocopy), then non-linear skb will be
created and filled with the pages of user's buffer. Pages of user's
buffer are locked in memory by 'get_user_pages()'. Second thing that
this patch does is replace type of skb owning: instead of calling
'skb_set_owner_sk_safe()' it calls 'skb_set_owner_w()'. Reason of this
change is that '__zerocopy_sg_from_iter()' increments 'sk_wmem_alloc'
of socket, so to decrease this field correctly proper skb destructor is
needed: 'sock_wfree()'. This destructor is set by 'skb_set_owner_w()'.

Signed-off-by: Arseniy Krasnov <avkrasnov@xxxxxxxxxxxxxxxxx>

[...]


-/* Returns a new packet on success, otherwise returns NULL.
- *
- * If NULL is returned, errp is set to a negative errno.
- */
-static struct sk_buff *
-virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *info,
-               size_t len,
-               u32 src_cid,
-               u32 src_port,
-               u32 dst_cid,
-               u32 dst_port)
-{
-    const size_t skb_len = VIRTIO_VSOCK_SKB_HEADROOM + len;
-    struct virtio_vsock_hdr *hdr;
-    struct sk_buff *skb;
+static bool virtio_transport_can_zcopy(struct virtio_vsock_pkt_info *info,
+                       size_t max_to_send)
                                              ^
I'd call it `pkt_len`, `max_to_send` is confusing IMHO. I didn't
initially if it was the number of buffers or bytes.

+{
+    const struct virtio_transport *t_ops;
+    struct iov_iter *iov_iter;
+
+    if (!info->msg)
+        return false;
+
+    iov_iter = &info->msg->msg_iter;
+
+    if (iov_iter->iov_offset)
+        return false;
+
+    /* We can't send whole iov. */
+    if (iov_iter->count > max_to_send)
+        return false;
+
+    /* Check that transport can send data in zerocopy mode. */
+    t_ops = virtio_transport_get_ops(info->vsk);
+
+    if (t_ops->can_msgzerocopy) {

So if `can_msgzerocopy` is not implemented, we always return true after
this point. Should we mention it in the .can_msgzerocopy documentation?

Ops, this is my mistake, I must return 'false' in this case. Seems I didn't
catch this problem with my tests, because there was no test case where
zerocopy will fallback to copy!

I'll fix it and add new test!

yep, I agree!



Can we also mention in the commit description why this is need only for
virtio_tranport and not for vhost and loopback?

+        int pages_in_iov = iov_iter_npages(iov_iter, MAX_SKB_FRAGS);
+        int pages_to_send = min(pages_in_iov, MAX_SKB_FRAGS);
+
+        return t_ops->can_msgzerocopy(pages_to_send);
+    }
+
+    return true;
+}
+

[...]

@@ -270,6 +395,17 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
            break;
        }

+        /* This is last skb to send this portion of data. */

Sorry I didn't get it :-(

Can you elaborate this a bit more?

I mean that we iterate over user's buffer here, allocating skb on each
iteration. And for last skb for this buffer we initialize completion
for user (we need to allocate one completion for one syscall).

Okay, so maybe we should explain better also in the code comment.

Thanks for review, I'll fix all other comments and resend patchset when
'net-next' will be opened again.

Cool, thanks!
Stefano