Re: WARNING in ip_recv_error

From: Willem de Bruijn
Date: Sun May 20 2018 - 18:19:04 EST


On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
<willemdebruijn.kernel@xxxxxxxxx> wrote:
> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>>> On Fri, May 18, 2018 at 11:44 AM, David Miller <davem@xxxxxxxxxxxxx> wrote:
>>>>> From: Eric Dumazet <eric.dumazet@xxxxxxxxx>
>>>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>>>
>>>>>> We probably need to revert Willem patch (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>>>
>>>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>>>
>>>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>>>> atomic operation, so that the socket is neither fully ipv4 nor fully
>>>> ipv6 by the time it reaches ip_recv_error.
>>>>
>>>> sk->sk_socket->ops = &inet_dgram_ops;
>>>> < HERE >
>>>> sk->sk_family = PF_INET;
>>>>
>>>> Even calling inet_recv_error to demux would not necessarily help.
>>>>
>>>> Safest would be to look up by skb->protocol, similar to what
>>>> ipv6_recv_error does to handle v4-mapped-v6.
>>>>
>>>> Or to make that function safe with PF_INET and swap the order
>>>> of the above two operations.
>>>>
>>>> All sound needlessly complicated for this rare socket option, but
>>>> I don't have a better idea yet. Dropping on the floor is not nice,
>>>> either.
>>>
>>> Ensuring that ip_recv_error correctly handles packets from either
>>> socket and removing the warning should indeed be good.
>>>
>>> It is robust against v4-mapped packets from an AF_INET6 socket,
>>> but see caveat on reconnect below.
>>>
>>> The code between ipv6_recv_error for v4-mapped addresses and
>>> ip_recv_error is essentially the same, the main difference being
>>> whether to return network headers as sockaddr_in with SOL_IP
>>> or sockaddr_in6 with SOL_IPV6.
>>>
>>> There are very few other locations in the stack that explicitly test
>>> sk_family in this way and thus would be vulnerable to races with
>>> IPV6_ADDRFORM.
>>>
>>> I'm not sure whether it is possible for a udpv6 socket to queue a
>>> real ipv6 packet on the error queue, disconnect, connect to an
>>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>>> on a true ipv6 packet. That would return buggy data, e.g., in
>>> msg_name.
>>
>> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
>> error queue is empty, and then take its lock for the duration of the
>> operation.
>
> Actually, no reason to hold the lock. This setsockopt holds the socket
> lock, which connect would need, too. So testing that the queue
> is empty after testing that it is connected to a v4 address is
> sufficient to ensure that no ipv6 packets are queued for reception.
>
> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
> index 4d780c7f0130..a975d6311341 100644
> --- a/net/ipv6/ipv6_sockglue.c
> +++ b/net/ipv6/ipv6_sockglue.c
> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
> int level, int optname,
>
> if (ipv6_only_sock(sk) ||
> !ipv6_addr_v4mapped(&sk->sk_v6_daddr)) {
> retv = -EADDRNOTAVAIL;
> break;
> }
>
> + if (!skb_queue_empty(&sk->sk_error_queue)) {
> + retv = -EBUSY;
> + break;
> + }
> +
> fl6_free_socklist(sk);
> __ipv6_sock_mc_close(sk);
>
> After this it should be safe to remove the warning in ip_recv_error.

Hmm.. nope.

This ensures that the socket cannot produce any new true v6 packets.
But it does not guarantee that they are not already in the system, e.g.
queued in tc, and will find their way to the error queue later.

We'll have to just be able to handle ipv6 packets in ip_recv_error.
Since IPV6_ADDRFORM is used to pass to legacy v4-only
processes and those likely are only confused by SOL_IPV6
error messages, it is probably best to just drop them and perhaps
WARN_ONCE.