Re: WARNING in ip_recv_error

From: Willem de Bruijn
Date: Wed May 23 2018 - 10:47:52 EST

Next message: Jan Beulich: "Re: [PATCH v5 1/2] xen/PVH: Set up GS segment for stack canary"
Previous message: Mark Brown: "Re: [PATCH v3 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings"
In reply to: Willem de Bruijn: "Re: WARNING in ip_recv_error"
Next in thread: Willem de Bruijn: "Re: WARNING in ip_recv_error"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Sun, May 20, 2018 at 7:13 PM, Willem de Bruijn
<willemdebruijn.kernel@xxxxxxxxx> wrote:
> On Fri, May 18, 2018 at 2:59 PM, Willem de Bruijn
> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>> On Fri, May 18, 2018 at 2:46 PM, Willem de Bruijn
>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>> On Fri, May 18, 2018 at 2:44 PM, Willem de Bruijn
>>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>>> On Fri, May 18, 2018 at 1:09 PM, Willem de Bruijn
>>>> <willemdebruijn.kernel@xxxxxxxxx> wrote:
>>>>> On Fri, May 18, 2018 at 11:44 AM, David Miller <davem@xxxxxxxxxxxxx> wrote:
>>>>>> From: Eric Dumazet <eric.dumazet@xxxxxxxxx>
>>>>>> Date: Fri, 18 May 2018 08:30:43 -0700
>>>>>>
>>>>>>> We probably need to revert Willem patch (7ce875e5ecb8562fd44040f69bda96c999e38bbc)
>>>>>>
>>>>>> Is it really valid to reach ip_recv_err with an ipv6 socket?
>>>>>
>>>>> I guess the issue is that setsockopt IPV6_ADDRFORM is not an
>>>>> atomic operation, so that the socket is neither fully ipv4 nor fully
>>>>> ipv6 by the time it reaches ip_recv_error.
>>>>>
>>>>> sk->sk_socket->ops = &inet_dgram_ops;
>>>>> < HERE >
>>>>> sk->sk_family = PF_INET;
>>>>>
>>>>> Even calling inet_recv_error to demux would not necessarily help.
>>>>>
>>>>> Safest would be to look up by skb->protocol, similar to what
>>>>> ipv6_recv_error does to handle v4-mapped-v6.
>>>>>
>>>>> Or to make that function safe with PF_INET and swap the order
>>>>> of the above two operations.
>>>>>
>>>>> All sound needlessly complicated for this rare socket option, but
>>>>> I don't have a better idea yet. Dropping on the floor is not nice,
>>>>> either.
>>>>
>>>> Ensuring that ip_recv_error correctly handles packets from either
>>>> socket and removing the warning should indeed be good.
>>>>
>>>> It is robust against v4-mapped packets from an AF_INET6 socket,
>>>> but see caveat on reconnect below.
>>>>
>>>> The code between ipv6_recv_error for v4-mapped addresses and
>>>> ip_recv_error is essentially the same, the main difference being
>>>> whether to return network headers as sockaddr_in with SOL_IP
>>>> or sockaddr_in6 with SOL_IPV6.
>>>>
>>>> There are very few other locations in the stack that explicitly test
>>>> sk_family in this way and thus would be vulnerable to races with
>>>> IPV6_ADDRFORM.
>>>>
>>>> I'm not sure whether it is possible for a udpv6 socket to queue a
>>>> real ipv6 packet on the error queue, disconnect, connect to an
>>>> ipv4 address, call IPV6_ADDRFORM and then call ip_recv_error
>>>> on a true ipv6 packet. That would return buggy data, e.g., in
>>>> msg_name.
>>>
>>> In do_ipv6_setsockopt IPV6_ADDRFORM we can test that the
>>> error queue is empty, and then take its lock for the duration of the
>>> operation.
>>
>> Actually, no reason to hold the lock. This setsockopt holds the socket
>> lock, which connect would need, too. So testing that the queue
>> is empty after testing that it is connected to a v4 address is
>> sufficient to ensure that no ipv6 packets are queued for reception.
>>
>> diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
>> index 4d780c7f0130..a975d6311341 100644
>> --- a/net/ipv6/ipv6_sockglue.c
>> +++ b/net/ipv6/ipv6_sockglue.c
>> @@ -199,6 +199,11 @@ static int do_ipv6_setsockopt(struct sock *sk,
>> int level, int optname,
>>
>> if (ipv6_only_sock(sk) ||
>> !ipv6_addr_v4mapped(&sk->sk_v6_daddr)) {
>> retv = -EADDRNOTAVAIL;
>> break;
>> }
>>
>> + if (!skb_queue_empty(&sk->sk_error_queue)) {
>> + retv = -EBUSY;
>> + break;
>> + }
>> +
>> fl6_free_socklist(sk);
>> __ipv6_sock_mc_close(sk);
>>
>> After this it should be safe to remove the warning in ip_recv_error.
>
> Hmm.. nope.
>
> This ensures that the socket cannot produce any new true v6 packets.
> But it does not guarantee that they are not already in the system, e.g.
> queued in tc, and will find their way to the error queue later.
>
> We'll have to just be able to handle ipv6 packets in ip_recv_error.
> Since IPV6_ADDRFORM is used to pass to legacy v4-only
> processes and those likely are only confused by SOL_IPV6
> error messages, it is probably best to just drop them and perhaps
> WARN_ONCE.

Even more fun, this is not limited to the error queue.

I can queue a v6 packet for reception on a socket, connect to a v4
address, call IPV6_ADDRFORM and then a regular recvfrom will
return a partial v6 address as AF_INET.

We definitely do not want to have to add a check

if (skb->protocol == htons(ETH_P_IPV6)) {
kfree_skb(skb);
goto try_again;
}

to the normal recvmsg path.

An alternative may be to tighten the check on when to allow
IPV6_ADDRFORM. Not only return EBUSY if a packet is pending,
but also if any sk_{rmem, omem, wmem}_alloc is non-zero. Only,
these tightened constraints could break a legacy application.

Either way, this race is somewhat tangential to the one that
RaceFuzzer found. The sk changes that IPV6_ADDRFORM makes
to sk_prot, sk_socket->ops and sk_family are not atomic and will
not be. They need not be, because no other code assumes this
consistency.

So I'll start by removing the warning as Eric suggested.

Next message: Jan Beulich: "Re: [PATCH v5 1/2] xen/PVH: Set up GS segment for stack canary"
Previous message: Mark Brown: "Re: [PATCH v3 1/2] regulator: dt-bindings: add QCOM RPMh regulator bindings"
In reply to: Willem de Bruijn: "Re: WARNING in ip_recv_error"
Next in thread: Willem de Bruijn: "Re: WARNING in ip_recv_error"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]