Re: [RFC PATCH v1] vsock: check error queue to set EPOLLERR

From: Arseniy Krasnov
Date: Thu Mar 02 2023 - 10:09:22 EST




On 02.03.2023 16:38, Stefano Garzarella wrote:
> On Thu, Mar 02, 2023 at 02:41:29PM +0300, Arseniy Krasnov wrote:
>> Hello!
>>
>> On 02.03.2023 13:06, Stefano Garzarella wrote:
>>> On Wed, Mar 01, 2023 at 08:19:45AM +0300, Arseniy Krasnov wrote:
>>>> EPOLLERR must be set not only when there is error on the socket, but also
>>>> when error queue of it is not empty (may be it contains some control
>>>> messages). Without this patch 'poll()' won't detect data in error queue.
>>>
>>> Do you have a reproducer?
>>>
>> Dedicated reproducer - no:)
>> To reproduce this issue, i used last MSG_ZEROCOPY patches. Completion was inserted to
>> error queue, and 'poll()' didn't report about it. That was the reason, why this patch
>> was included to MSG_ZEROCOPY patchset. But also i think it is better to reduce number
>> of patches in it(i'm working on v2), so it is good to handle this patch separately.
>
> Yep, absolutely!
>
>> May be one way to reproduce it is use SO_TIMESTAMP(time info about skbuff will be queued
>> to the error queue). IIUC this feature is implemented at socket layer and may work in
>> vsock (but i'm not sure). Ok, i'll check it and try to implement reproducer.
>>
>> IIUC, for future, policy for fixes is "for each fix implement reproducer in vsock_test"?
>
> Nope, but for each fix we should have a Fixes tag.
>
> Usually we use vsock_test to check regressions on features and also the
> behaviour of different transports.
> My question was more about whether this problem was there before
> supporting sk_buff or not, to figure out which Fixes tag to use.
>
Ok i see
>>
>>>> This patch is based on 'tcp_poll()'.
>>>
>>> LGTM but we should add a Fixes tag.
>>> It's not clear to me whether the problem depends on when we switched to using sk_buff or was pre-existing.
>>>
>>> Do you have any idea when we introduced this issue?
>> git blame shows, that this code exists since first commit to vsock:
>
> Okay, but did we use sk_error_queue before supporting sk_buff?
>
No I think, sk_error_queue was unavailable to user(and still unavailable today),
because we don't have check for MSG_ERRQUEUE flag in recv logic in af_vsock.c
(i've added it in MSG_ZEROCOPY). So even if some subsystem of the kernel inserts
skb to sk_error_queue in AF_VSOCK case, user won't dequeue it.

> Anyway, if we are not sure I think we can use the following Fixes tag,
> I don't see any issue if we backport this patch also before supporting
> sk_buff.
>
Ok, i'll try to prepare reproducer(may be in vsock_test) and add Fixes tag with the
commit "VSOCK: Introduce VM Sockets."

Thanks, Arseniy
> Thanks,
> Stefano
>
>>
>> commit d021c344051af91f42c5ba9fdedc176740cbd238
>> Author: Andy King <acking@xxxxxxxxxx>
>> Date:   Wed Feb 6 14:23:56 2013 +0000
>>
>>    VSOCK: Introduce VM Sockets
>>
>> For TCP same logic was added by:
>>
>> commit 4ed2d765dfaccff5ebdac68e2064b59125033a3b
>> Author: Willem de Bruijn <willemb@xxxxxxxxxx>
>> Date:   Mon Aug 4 22:11:49 2014 -0400
>>
>>    net-timestamp: TCP timestamping
>>
>>
>>>
>>> Thanks,
>>> Stefano
>>>
>>
>> Thanks Arseniy
>>
>>>>
>>>> Signed-off-by: Arseniy Krasnov <AVKrasnov@xxxxxxxxxxxxxx>
>>>> ---
>>>> net/vmw_vsock/af_vsock.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>>>> index 19aea7cba26e..b5e51ef4a74c 100644
>>>> --- a/net/vmw_vsock/af_vsock.c
>>>> +++ b/net/vmw_vsock/af_vsock.c
>>>> @@ -1026,7 +1026,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock,
>>>>     poll_wait(file, sk_sleep(sk), wait);
>>>>     mask = 0;
>>>>
>>>> -    if (sk->sk_err)
>>>> +    if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue))
>>>>         /* Signify that there has been an error on this socket. */
>>>>         mask |= EPOLLERR;
>>>>
>>>> -- 
>>>> 2.25.1
>>>>
>>>
>>
>