Re: [RFC PATCH v1 0/7] virtio/vsock: introduce MSG_EOR flag for SEQPACKET

From: Arseny Krasnov
Date: Thu Aug 05 2021 - 05:22:05 EST



On 05.08.2021 12:06, Stefano Garzarella wrote:
> On Thu, Aug 05, 2021 at 11:33:12AM +0300, Arseny Krasnov wrote:
>> On 04.08.2021 15:57, Stefano Garzarella wrote:
>>> Hi Arseny,
>>>
>>> On Mon, Jul 26, 2021 at 07:31:33PM +0300, Arseny Krasnov wrote:
>>>> This patchset implements support for the MSG_EOR bit for SEQPACKET
>>>> AF_VSOCK sockets over the virtio transport.
>>>> The idea is to distinguish the concepts of 'messages' and 'records'.
>>>> A message is the result of a sending call: 'write()', 'send()',
>>>> 'sendmsg()', etc. It has a fixed maximum length, and its bounds are
>>>> visible through the return value of receive calls: 'read()', 'recv()',
>>>> 'recvmsg()', etc. The current implementation is based on the message
>>>> definition above.
>>> Okay, so the implementation we merged is wrong, right?
>>> Should we disable the feature bit in stable kernels that contain it? Or
>>> maybe we can backport the fixes...
>> Hi,
>>
>> No, this is correct, and it is message-boundary based. The idea of this
>> patchset is to add an extra boundary marker, which I think could be
>> useful when we want to send data in seqpacket mode whose length
>> is bigger than the maximum message length (this is limited by the
>> transport). Of course we can fragment a big piece of data into small
>> messages, but this requires carrying fragmentation info in the data
>> protocol. So in this case, when we want to maintain boundaries, the
>> receiver calls recvmsg() until MSG_EOR is found.
>> But when the receiver knows that the data fits in the maximum datagram
>> length, it doesn't care about checking MSG_EOR and just calls recv() or
>> read() (i.e. message-based mode).
> I'm not sure we should maintain boundaries of multiple send(), from
> POSIX standard [1]:

Yes, but also from POSIX: calls such as send() and sendmsg() operate on a
"message", and if we check recvmsg() we find the following:

For message-based sockets, such as SOCK_DGRAM and SOCK_SEQPACKET, the entire
message shall be read in a single operation. If a message is too long to fit
in the supplied buffers, and MSG_PEEK is not set in the flags argument, the
excess bytes shall be discarded.

I understand this to mean that send() boundaries must also be maintained.
I've checked SEQPACKET in AF_UNIX and AX.25: neither supports MSG_EOR, so
send() boundaries must be supported.
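
To illustrate the message-based reading above, here is a minimal sketch. It
assumes Linux and uses an AF_UNIX SOCK_SEQPACKET socketpair as a stand-in for
AF_VSOCK; Python is used only for brevity, and the function name is
hypothetical. It sends two messages and reads the first one into a buffer that
is too small.

```python
import socket


def demo_message_boundaries():
    # Assumption: AF_UNIX SOCK_SEQPACKET stands in for AF_VSOCK here.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    try:
        a.send(b"hello world")  # message 1, 11 bytes
        a.send(b"second")       # message 2, 6 bytes

        # The buffer is smaller than message 1, so only 5 bytes come back
        # and the kernel reports truncation in the returned flags.
        data1, _ancdata, flags1, _addr = b.recvmsg(5)
        truncated = bool(flags1 & socket.MSG_TRUNC)

        # The next call returns message 2, not the tail of message 1.
        data2, _ancdata, _flags2, _addr = b.recvmsg(64)
        return data1, truncated, data2
    finally:
        a.close()
        b.close()
```

With AF_UNIX on Linux, the tail of the first message is expected to be
discarded, so the second recvmsg() starts at the next send() boundary.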

>
> SOCK_SEQPACKET
> Provides sequenced, reliable, bidirectional, connection-mode
> transmission paths for records. A record can be sent using one or
> more output operations and received using one or more input
> operations, but a single operation never transfers part of more than
> one record. Record boundaries are visible to the receiver via the
> MSG_EOR flag.
>
> From my understanding a record could be sent with multiple send() and
> received, for example, with a single recvmsg().
> The only boundary should be the MSG_EOR flag set by the user on the last
> send() of a record.
You are right, if we are talking about a "record".
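
For the "record" case, a rough sketch of what both sides could look like,
assuming a transport that honours MSG_EOR on send and reports it on receive
(which is what this patchset proposes for virtio vsock). The helpers
send_record() and recv_record() and the max_msg_len parameter are hypothetical
names used only for illustration; Python is used for brevity.

```python
import socket


def send_record(sock, data, max_msg_len):
    """Send `data` as one record, split into messages of at most max_msg_len bytes."""
    if not data or max_msg_len <= 0:
        raise ValueError("need non-empty data and a positive max_msg_len")
    for off in range(0, len(data), max_msg_len):
        chunk = data[off:off + max_msg_len]
        last = off + max_msg_len >= len(data)
        # Only the final message of the record carries MSG_EOR.
        sock.send(chunk, socket.MSG_EOR if last else 0)


def recv_record(sock, bufsize):
    """Call recvmsg() until MSG_EOR is seen and return the whole record."""
    parts = []
    while True:
        data, _ancdata, msg_flags, _addr = sock.recvmsg(bufsize)
        if not data and not (msg_flags & socket.MSG_EOR):
            # Assumption: an empty read without MSG_EOR means the peer is gone.
            raise ConnectionError("connection closed before end of record")
        parts.append(data)
        if msg_flags & socket.MSG_EOR:
            return b"".join(parts)
```

A receiver that knows every record fits in one message can skip the loop and
call recv() once, which is the message-based mode.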
>
> From send() description [2]:
>
> MSG_EOR
> Terminates a record (if supported by the protocol).
>
> From recvmsg() description [3]:
>
> MSG_EOR
> End-of-record was received (if supported by the protocol).
>
> Thanks,
> Stefano
>
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html
> [2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/send.html
> [3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/recvmsg.html

P.S.: it seems SEQPACKET is exotic enough that everyone implements it in
their own manner. I've tested the SCTP seqpacket implementation and found
that:

1) It doesn't support the MSG_EOR bit on the send side, but uses MSG_EOR on
   the receive side to mark a MESSAGE boundary.
2) According to POSIX, any extra bytes that don't fit in the user's buffer
   must be dropped, but SCTP doesn't drop them: you can read the rest of the
   datagram in subsequent calls.
