Re: WARNING in ip_recv_error

From: Willem de Bruijn
Date: Wed May 23 2018 - 13:38:12 EST

Next message: David Miller: "Re: [PATCH net] tuntap: correctly set SOCKWQ_ASYNC_NOSPACE"
Previous message: Ravi Chandra Sadineni: "[PATCH] cros_ec_keyb: Increment the wakeup count to the specific mfd device."
In reply to: Willem de Bruijn: "Re: WARNING in ip_recv_error"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, May 23, 2018 at 11:40 AM, Willem de Bruijn
<willemdebruijn.kernel@xxxxxxxxx> wrote:
> On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>> On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>>> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>>>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>>>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>>>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>>>>> On Fri, May 18, 2018 at 11:44 AM, David Miller <davem@xxxxxxxxxxxxx> wrote:
>>>>>>> From: Eric Dumazet <eric.dumazet@xxxxxxxxx>
>>>>>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>>>>>
>>>>>>>> We probably need to revert Willem patch (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>>>>>
>>>>>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>>>>>
>>>>>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>>>>>> atomic operation, so that the socket is neither fully ipv4 nor fully
>>>>>> ipv6 by the time it reaches ip_recv_error.
>>>>>>
>>>>>> sk->sk_socket->ops = &inet_dgram_ops;
>>>>>> < HERE >
>>>>>> sk->sk_family = PF_INET;
>>>>>>
>>>>>> Even calling inet_recv_error to demux would not necessarily help.
>>>>>>
>>>>>> Safest would be to look up by skb->protocol, similar to what
>>>>>> ipv6_recv_error does to handle v4-mapped-v6.
>>>>>>
>>>>>> Or to make that function safe with PF_INET and swap the order
>>>>>> of the above two operations.
>>>>>>
>>>>>> All sound needlessly complicated for this rare socket option, but
>>>>>> I don't have a better idea yet. Dropping on the floor is not nice,
>>>>>> either.
>>>>>
>>>>> Ensuring that ip_recv_error correctly handles packets from either
>>>>> socket and removing the warning should indeed be good.
>>>>>
>>>>> It is robust against v4-mapped packets from an AF_INET6 socket,
>>>>> but see caveat on reconnect below.
>>>>>
>>>>> The code between ipv6_recv_error for v4-mapped addresses and
>>>>> ip_recv_error is essentially the same, the main difference being
>>>>> whether to return network headers as sockaddr_in with SOL_IP
>>>>> or sockaddr_in6 with SOL_IPV6.
>>>>>
>>>>> There are very few other locations in the stack that explicitly test
>>>>> sk_family in this way and thus would be vulnerable to races with
>>>>> IPV6_ADDRFORM.
>>>>>
>>>>> I'm not sure whether it is possible for a udpv6 socket to queue a
>>>>> real ipv6 packet on the error queue, disconnect, connect to an
>>>>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>>>>> on a true ipv6 packet. That would return buggy data, e.g., in
>>>>> msg_name.
>>>>
>>>> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
>>>> error queue is empty, and then take its lock for the duration of the
>>>> operation.
>>>
>>> Actually, no reason to hold the lock. This setsockopt holds the socket
>>> lock, which connect would need, too. So testing that the queue
>>> is empty after testing that it is connected to a v4 address is
>>> sufficient to ensure that no ipv6 packets are queued for reception.
>>>
>>> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
>>> index 4d780c7f0130..a975d6311341 100644
>>> --- a/net/ipv6/ipv6_sockglue.c
>>> +++ b/net/ipv6/ipv6_sockglue.c
>>> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
>>> int level, int optname,
>>>
>>> if (ipv6_only_sock(sk) ||
>>> !ipv6_addr_v4mapped(&sk->sk_v6_daddr)) {
>>> retv = -EADDRNOTAVAIL;
>>> break;
>>> }
>>>
>>> + if (!skb_queue_empty(&sk->sk_error_queue)) {
>>> + retv = -EBUSY;
>>> + break;
>>> + }
>>> +
>>> fl6_free_socklist(sk);
>>> __ipv6_sock_mc_close(sk);
>>>
>>> After this it should be safe to remove the warning in ip_recv_error.
>>
>> Hmm.. nope.
>>
>> This ensures that the socket cannot produce any new true v6 packets.
>> But it does not guarantee that they are not already in the system, e.g.
>> queued in tc, and will find their way to the error queue later.
>>
>> We'll have to just be able to handle ipv6 packets in ip_recv_error.
>> Since IPV6_ADDRFORM is used to pass to legacy v4-only
>> processes and those likely are only confused by SOL_IPV6
>> error messages, it is probably best to just drop them and perhaps
>> WARN_ONCE.
>
> Even more fun, this is not limited to the error queue.
>
> I can queue a v6 packet for reception on a socket, connect to a v4
> address, call IPV6_ADDRFORM and then a regular recvfrom will
> return a partial v6 address as AF_INET.
>
> We definitely do not want to have to add a check
>
> if (skb->protocol == htons(ETH_P_IPV6)) {
> kfree_skb(skb);
> goto try_again;
> }
>
> to the normal recvmsg path.
>
> An alternative may be to tighten the check on when to allow
> IPV6_ADDRFORM. Not only return EBUSY if a packet is pending,
> but also if any sk_{rmem, omem, wmem}_alloc is non-zero. Only,
> these tightened constraints could break a legacy application.
>
> Either way, this race is somewhat tangential to the one that
> RaceFuzzer found. The sk changes that IPV6_ADDRFORM makes
> to sk_prot, sk_socket->ops and sk_family are not atomic and will
> not be. They need not be, because no other code assumes this
> consistency.
>
> So I'll start by removing the warning as Eric suggested.

http://patchwork.ozlabs.org/patch/919270/

Next message: David Miller: "Re: [PATCH net] tuntap: correctly set SOCKWQ_ASYNC_NOSPACE"
Previous message: Ravi Chandra Sadineni: "[PATCH] cros_ec_keyb: Increment the wakeup count to the specific mfd device."
In reply to: Willem de Bruijn: "Re: WARNING in ip_recv_error"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]