Re: net: hang in unregister_netdevice: waiting for lo to become free

From: Dmitry Vyukov
Date: Mon Feb 19 2018 - 14:00:11 EST


On Sat, Feb 3, 2018 at 1:15 PM, Xin Long <lucien.xin@xxxxxxxxx> wrote:
>>> On 1/30/18 1:57 PM, David Ahern wrote:
>>>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>>>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> The following program creates a hang in unregister_netdevice.
>>>>>>> cleanup_net work hangs there forever periodically printing
>>>>>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>>>>>> and creation of any new network namespaces hangs forever.
>>>>>>
>>>>>> Interestingly, this is not reproducible on net-next.
>>>>>
>>>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
>>>>> sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
>>>>> fixed due to that?
>>>>>
>>>>
>>>> This appears to be the commit introducing the refcnt leak:
>>>>
>>>> $ git bisect bad
>>>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
>>>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
>>>> Author: Xin Long <lucien.xin@xxxxxxxxx>
>>>> Date: Fri May 12 14:39:52 2017 +0800
>>>>
>>>> sctp: fix src address selection if using secondary addresses for ipv6
>>>>
>>>>
>>>> v4.14 is bad. Running bisect in the background while doing other things....
>>>>
>>>
>>> Interesting. The commit that avoids the refcnt leak is
>>>
>>> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
>>> Author: David Ahern <dsahern@xxxxxxxxx>
>>> Date: Wed Jan 24 19:45:29 2018 -0800
>>>
>>> net/ipv6: Do not allow route add with a device that is down
>>>
>>> That commit does not intentionally address the problem so it is just
>>> masking the problematic code introduced by the commit above.
>> Thanks, David A.
>>
>> I'm still on a trip. Will look into this ASAP.
>
> Alexey and Tommi already had patches for this issue on
> both SCTP v4 and v6 dst_get. Thanks.



Is this meant to be fixed already? I am still seeing this on the
latest upstream tree.
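
For anyone who wants to check whether their own logs show the same symptom, here is a minimal sketch that scans kernel log text for the message quoted at the top of this thread. The helper name find_stuck_devices and the regular expression are my own assumptions for illustration; the message wording is taken from the report above and may differ on other kernel versions.

```python
import re

# Pattern for the "waiting for <dev> to become free" message quoted in this
# thread. Assumption: the wording matches the report; other kernels may differ.
_WAITING_RE = re.compile(
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\. "
    r"Usage count = (?P<count>-?\d+)"
)


def find_stuck_devices(log_text):
    """Return {device name: last reported usage count} found in log_text."""
    stuck = {}
    for line in log_text.splitlines():
        match = _WAITING_RE.search(line)
        if match:
            # Later lines overwrite earlier ones, so the newest count wins.
            stuck[match.group("dev")] = int(match.group("count"))
    return stuck
```

Feeding it the saved output of dmesg would give a quick yes/no on whether cleanup_net is stuck waiting on a device. It only detects the symptom; it says nothing about which code path holds the leaked reference.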