[RFC PATCH v9 00/19] virtio/vsock: introduce SOCK_SEQPACKET support

From: Arseny Krasnov
Date: Sat May 08 2021 - 12:32:04 EST

This patchset implements support of SOCK_SEQPACKET for the virtio
transport. SOCK_SEQPACKET guarantees that record boundaries are
preserved, so a new bit was added to the 'flags' field of the packet
header: SEQ_EOR. This bit is set to 1 in the last RW packet of a message.
Since packets of one socket are reordered neither at the vsock nor at
the vhost transport layer, this bit is enough to restore the original
message on the receiver's side. If the user's buffer is smaller than the
message length, all data that does not fit is dropped.
The maximum length of a message is not limited, as with a stream socket,
because the same credit logic is used. The difference from a stream
socket is that the user is not woken up until the whole record is
received or an error occurs. The implementation also supports the
'MSG_TRUNC' flag.
Tests are also implemented.
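
To make the boundary rule above concrete, here is a minimal user-space
sketch of it. It is not code from this series. It splits a message into
fixed-size packets with an end-of-record bit on the last one, then
reassembles the record into a buffer that may be too small. All names
(sketch_pkt, sketch_send, sketch_recv, SKETCH_SEQ_EOR,
SKETCH_MAX_PAYLOAD) are hypothetical, messages are assumed non-empty,
and credit accounting and sleeping are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the end-of-record bit in the 'flags' field. */
#define SKETCH_SEQ_EOR     0x1u
/* Tiny payload size, chosen only to force fragmentation in the sketch. */
#define SKETCH_MAX_PAYLOAD 4

struct sketch_pkt {
	unsigned int flags;
	size_t len;
	unsigned char data[SKETCH_MAX_PAYLOAD];
};

/*
 * Split one message into packets; only the last packet carries the
 * end-of-record bit. Returns the number of packets written, or 0 if
 * 'out' is too small.
 */
static size_t sketch_send(const void *msg, size_t msg_len,
			  struct sketch_pkt *out, size_t out_cap)
{
	const unsigned char *p = msg;
	size_t n = 0;

	assert(msg_len > 0);
	while (msg_len > 0) {
		size_t chunk = msg_len < SKETCH_MAX_PAYLOAD ?
			       msg_len : SKETCH_MAX_PAYLOAD;

		if (n == out_cap)
			return 0;
		memcpy(out[n].data, p, chunk);
		out[n].len = chunk;
		p += chunk;
		msg_len -= chunk;
		out[n].flags = msg_len == 0 ? SKETCH_SEQ_EOR : 0;
		n++;
	}
	return n;
}

/*
 * Walk an in-order packet queue until the end-of-record bit is seen.
 * Up to 'buf_len' bytes are copied, the rest of the record is dropped.
 * Returns the full record length and stores the number of packets used
 * in '*consumed'. Returns 0 (with '*consumed' set to 0) if the queue
 * ends before the record does; a real receiver would keep waiting.
 */
static size_t sketch_recv(const struct sketch_pkt *q, size_t q_len,
			  void *buf, size_t buf_len, size_t *consumed)
{
	unsigned char *dst = buf;
	size_t copied = 0, record_len = 0, i;

	for (i = 0; i < q_len; i++) {
		size_t room = buf_len - copied;
		size_t take = q[i].len < room ? q[i].len : room;

		memcpy(dst + copied, q[i].data, take);
		copied += take;
		record_len += q[i].len;
		if (q[i].flags & SKETCH_SEQ_EOR) {
			*consumed = i + 1;
			return record_len;
		}
	}
	*consumed = 0;
	return 0;
}
```

For the 10-byte message "0123456789", sketch_send() produces packets of
4, 4 and 2 bytes with the bit set only on the third. sketch_recv() into
a 4-byte buffer copies "0123", drops the remaining 6 bytes and still
reports 10, which is the kind of value a 'MSG_TRUNC' receive is expected
to report.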

Thanks to stsp2@xxxxxxxxx for the encouragement and initial design

Arseny Krasnov (19):
af_vsock: update functions for connectible socket
af_vsock: separate wait data loop
af_vsock: separate receive data loop
af_vsock: implement SEQPACKET receive loop
af_vsock: implement send logic for SEQPACKET
af_vsock: rest of SEQPACKET support
af_vsock: update comments for stream sockets
virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
virtio/vsock: simplify credit update function API
virtio/vsock: defines and constants for SEQPACKET
virtio/vsock: dequeue callback for SOCK_SEQPACKET
virtio/vsock: add SEQPACKET receive logic
virtio/vsock: rest of SOCK_SEQPACKET support
virtio/vsock: enable SEQPACKET for transport
vhost/vsock: enable SEQPACKET for transport
vsock/loopback: enable SEQPACKET for transport
vsock_test: add SOCK_SEQPACKET tests
virtio/vsock: update trace event for SEQPACKET
af_vsock: serialize writes to shared socket

drivers/vhost/vsock.c | 42 +-
include/linux/virtio_vsock.h | 9 +
include/net/af_vsock.h | 8 +
.../events/vsock_virtio_transport_common.h | 5 +-
include/uapi/linux/virtio_vsock.h | 9 +
net/vmw_vsock/af_vsock.c | 417 +++++++++++------
net/vmw_vsock/virtio_transport.c | 25 +
net/vmw_vsock/virtio_transport_common.c | 129 ++++-
net/vmw_vsock/vsock_loopback.c | 11 +
tools/testing/vsock/util.c | 32 +-
tools/testing/vsock/util.h | 3 +
tools/testing/vsock/vsock_test.c | 63 +++
12 files changed, 594 insertions(+), 159 deletions(-)

v8 -> v9:
General changelog:
- see per patch change log.

Per patch changelog:
see every patch after '---' line.

v7 -> v8:
General changelog:
- the whole idea is simplified: the channel is now considered reliable,
so SEQ_BEGIN, SEQ_END, 'msg_len' and 'msg_id' were removed.
The only thing used to mark the end of a message is a bit in the
'flags' field of the packet header: VIRTIO_VSOCK_SEQ_EOR. A packet
with this bit set to 1 is the last packet of a message.

- POSIX MSG_EOR support is removed, as there is no exact
description of how it works.

- all changes to 'include/uapi/linux/virtio_vsock.h' moved
to dedicated patch, as these changes linked with patch to

- patch 'virtio/vsock: SEQPACKET feature bit support' is now merged
into 'virtio/vsock: setup SEQPACKET ops for transport'.

- patch 'vhost/vsock: SEQPACKET feature bit support' is now merged
into 'vhost/vsock: setup SEQPACKET ops for transport'.

Per patch changelog:
see every patch after '---' line.

v6 -> v7:
General changelog:
- the virtio transport callback for message length is now removed
from the transport. The length of a record is returned by the dequeue
callback.

- the function which tries to get the message length now returns 0
when the rx queue is empty. Also, the length of the current message in
progress is set to 0 when the message is processed or an error
occurs.

- patches for virtio feature bit moved after patches with
transport ops.

Per patch changelog:
see every patch after '---' line.

v5 -> v6:
General changelog:
- virtio transport specific callbacks which send SEQ_BEGIN or
SEQ_END now hidden inside virtio transport. Only enqueue,
dequeue and record length callbacks are provided by transport.

- virtio feature bit for SEQPACKET socket support introduced:

- 'msg_cnt' field in 'struct virtio_vsock_seq_hdr' renamed to
'msg_id' and used as id.

Per patch changelog:
- 'af_vsock: separate wait data loop':
1) Commit message updated.
2) 'prepare_to_wait()' moved inside the while loop (thanks to
Jorgen Hansen).
The patch was marked 'Reviewed-by' with 1), but because of 2) I removed
the R-b.

- 'af_vsock: separate receive data loop': commit message
Marked 'Reviewed-by' with that fix.

- 'af_vsock: implement SEQPACKET receive loop': style fixes.

- 'af_vsock: rest of SEQPACKET support':
1) 'module_put()' added when transport callback check failed.
2) Now only 'seqpacket_allow()' callback called to check
support of SEQPACKET by transport.

- 'af_vsock: update comments for stream sockets': commit message
Marked 'Reviewed-by' with that fix.

- 'virtio/vsock: set packet's type in send':
1) Commit message updated.
2) Parameter 'type' from 'virtio_transport_send_credit_update()'
also removed in this patch instead of in next.

- 'virtio/vsock: dequeue callback for SOCK_SEQPACKET': SEQPACKET
related state wrapped to special struct.

- 'virtio/vsock: update trace event for SEQPACKET': format strings
are no longer broken across lines.

v4 -> v5:
- patches reorganized:
1) Setting of packet's type in 'virtio_transport_send_pkt_info()'
is moved to separate patch.
2) Simplifying of 'virtio_transport_send_credit_update()' is
moved to separate patch and before main virtio/vsock patches.
- style problem fixed
- in 'af_vsock: separate receive data loop' extra 'release_sock()'
- added trace event fields for SEQPACKET
- in 'af_vsock: separate wait data loop':
1) 'vsock_wait_data()' removed 'goto out;'
2) Comment for invalid data amount is changed.
- in 'af_vsock: rest of SEQPACKET support', 'new_transport' pointer
check is moved after 'try_module_get()'
- in 'af_vsock: update comments for stream sockets', 'connect-oriented'
replaced with 'connection-oriented'
- in 'loopback/vsock: setup SEQPACKET ops for transport',
'loopback/vsock' replaced with 'vsock/loopback'

v3 -> v4:
- SEQPACKET specific metadata moved from packet header to payload
and called 'virtio_vsock_seq_hdr'
- record integrity check:
1) SEQ_END operation was added, which marks end of record.
2) Both SEQ_BEGIN and SEQ_END carry a counter which is incremented
each time a marker is sent.
- af_vsock.c: socket operations for STREAM and SEQPACKET call same
functions instead of having own "gates" differs only by names:
'vsock_seqpacket/stream_getsockopt()' now replaced with
- af_vsock.c: 'seqpacket_dequeue' callback returns an error and a flag
that the record is ready. There is no need to return the number of
copied bytes, because the case when a record is received successfully
is checked at the virtio transport layer, when SEQ_END is processed.
Also, the user doesn't need the number of copied bytes, because
'recv()' from SEQPACKET can return an error, the length of the user's
buffer or the length of the whole record (both are known in
af_vsock.c).
- af_vsock.c: both wait loops in af_vsock.c (for data and space) moved
to separate functions, because both are now called from several places.
- af_vsock.c: 'vsock_assign_transport()' checks that 'new_transport'
pointer is not NULL and returns 'ESOCKTNOSUPPORT' instead of 'ENODEV'
if failed to use transport.
- tools/testing/vsock/vsock_test.c: rename tests
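
The possible 'recv()' results mentioned above (error, length of the
user's buffer, length of the whole record) can be observed from user
space. The sketch below uses an AF_UNIX SOCK_SEQPACKET socketpair purely
as a stand-in on Linux, on the assumption that the vsock implementation
aims for the same user-visible semantics. It does not touch AF_VSOCK,
and demo_recv_record() is a hypothetical helper, not part of the test
suite in this series.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send 'msg' as one record over a SOCK_SEQPACKET pair and receive it
 * into a buffer of 'buf_len' bytes (at most 64) with 'recv_flags'.
 * Returns whatever recv() returned, or -1 on any setup failure.
 */
static ssize_t demo_recv_record(const char *msg, size_t buf_len,
				int recv_flags)
{
	char buf[64];
	int sv[2];
	ssize_t ret;

	if (buf_len > sizeof(buf))
		return -1;
	/* AF_UNIX is an assumption made so the sketch runs without vsock. */
	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;

	if (send(sv[0], msg, strlen(msg), 0) < 0)
		ret = -1;
	else
		ret = recv(sv[1], buf, buf_len, recv_flags);

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

With the 11-byte record "hello world", a 64-byte buffer gives 11, a
4-byte buffer without flags gives 4 (the rest of the record is
discarded), and a 4-byte buffer with 'MSG_TRUNC' gives 11, the length of
the whole record.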

v2 -> v3:
- patches reorganized: split for prepare and implementation patches
- local variables are declared in "Reverse Christmas tree" manner
- virtio_transport_common.c: correct leXX_to_cpu() conversions are used
when accessing vsock header fields
- af_vsock.c: 'vsock_connectible_*sockopt()' added as shared code
between stream and seqpacket sockets.
- af_vsock.c: loops in '__vsock_*_recvmsg()' refactored.
- af_vsock.c: 'vsock_wait_data()' refactored.

v1 -> v2:
- patches reordered: af_vsock.c related changes now before virtio vsock
- patches reorganized: more small patches, where +/- are not mixed
- tests for SOCK_SEQPACKET added
- all commit messages updated
- af_vsock.c: 'vsock_pre_recv_check()' inlined to
- af_vsock.c: 'vsock_assign_transport()' returns ENODEV if transport
was not found
- virtio_transport_common.c: transport callback for seqpacket dequeue
- virtio_transport_common.c: simplified
- virtio_transport_common.c: send reset on socket and packet type

Signed-off-by: Arseny Krasnov <arseny.krasnov@xxxxxxxxxxxxx>