This adds handling of MSG_ZEROCOPY flag on transmission path:
1) If this flag is set and zerocopy transmission is possible (enabled
in socket options and transport allows zerocopy), then non-linear
skb will be created and filled with the pages of user's buffer.
Pages of user's buffer are locked in memory by 'get_user_pages()'.
2) Replaces way of skb owning: instead of 'skb_set_owner_sk_safe()' it
calls 'skb_set_owner_w()'. Reason of this change is that
'__zerocopy_sg_from_iter()' increments 'sk_wmem_alloc' of socket, so
to decrease this field correctly, proper skb destructor is needed:
'sock_wfree()'. This destructor is set by 'skb_set_owner_w()'.
3) Adds new callback to 'struct virtio_transport': 'can_msgzerocopy'.
If this callback is set, then transport needs extra check to be able
to send provided number of buffers in zerocopy mode. Currently, the
only transport that needs this callback set is virtio, because this
transport adds new buffers to the virtio queue and we need to check,
that number of these buffers is less than size of the queue (it is
required by virtio spec). vhost and loopback transports don't need
this check.
Signed-off-by: Arseniy Krasnov <avkrasnov@xxxxxxxxxxxxxxxxx>
---
Changelog:
v5(big patchset) -> v1:
* Refactorings of 'if' conditions.
* Remove extra blank line.
* Remove 'frag_off' field unneeded init.
* Add function 'virtio_transport_fill_skb()' which fills both linear
and non-linear skb with provided data.
v1 -> v2:
* Use original order of last four arguments in 'virtio_transport_alloc_skb()'.
v2 -> v3:
* Add new transport callback: 'msgzerocopy_check_iov'. It checks that
provided 'iov_iter' with data could be sent in a zerocopy mode.
If this callback is not set in transport - transport allows to send
any 'iov_iter' in zerocopy mode. Otherwise - if callback returns 'true'
then zerocopy is allowed. Reason of this callback is that in case of
G2H transmission we insert whole skb to the tx virtio queue and such
skb must fit to the size of the virtio queue to be sent in a single
iteration (may be tx logic in 'virtio_transport.c' could be reworked
as in vhost to support partial send of current skb). This callback
will be enabled only for G2H path. For details pls see comment
'Check that tx queue...' below.
v3 -> v4:
* 'msgzerocopy_check_iov' moved from 'struct vsock_transport' to
'struct virtio_transport' as it is virtio specific callback and
never needed in other transports.
v4 -> v5:
* 'msgzerocopy_check_iov' renamed to 'can_msgzerocopy' and now it
uses number of buffers to send as input argument. I think there is
no need to pass iov to this callback (at least today, it is used only
by guest side of virtio transport), because the only thing that this
callback does is comparison of number of buffers to be inserted to
the tx queue and size of this queue.
* Remove any checks for type of current 'iov_iter' with payload (is it
'iovec' or 'ubuf'). These checks left from the earlier versions where I
didn't use already implemented kernel API which handles every type of
'iov_iter'.
v5 -> v6:
* Refactor 'virtio_transport_fill_skb()'.
* Add 'WARN_ON_ONCE()' and comment on invalid combination of destination
socket and payload in 'virtio_transport_alloc_skb()'.
v7 -> v8:
* Move '+1' addition from 'can_msgzerocopy' callback body to the caller.
This addition means packet header.
* In 'virtio_transport_can_zcopy()' rename 'max_to_send' argument to
'pkt_len'.
* Update commit message by adding details about new 'can_msgzerocopy'
callback.
* In 'virtio_transport_init_hdr()' move 'len' argument directly after
'info'.
* Add comment about processing last skb in tx loop.
* Update comment for 'can_msgzerocopy' callback for more details.
v8 -> v9:
* Return and update comment for 'virtio_transport_alloc_skb()'.
* Pass pointer to transport ops to 'virtio_transport_can_zcopy()',
this allows to use it directly without calling virtio_transport_get_ops()'.
* Remove redundant call for 'msg_data_left()' in 'virtio_transport_fill_skb()'.
* Do not pass 'struct vsock_sock*' to 'virtio_transport_alloc_skb()',
use same pointer from already passed 'struct virtio_vsock_pkt_info*'.
* Fix setting 'end of message' bit for SOCK_SEQPACKET (add call for
'msg_data_left()' == 0).
* Add 'zcopy' parameter to packet allocation trace event.
include/linux/virtio_vsock.h | 9 +
.../events/vsock_virtio_transport_common.h | 12 +-
net/vmw_vsock/virtio_transport.c | 32 +++
net/vmw_vsock/virtio_transport_common.c | 250 ++++++++++++++----
4 files changed, 241 insertions(+), 62 deletions(-)