Re: net: hang in unregister_netdevice: waiting for lo to become free

From: Xin Long
Date: Wed Jan 31 2018 - 19:49:22 EST


On Tue, Jan 30, 2018 at 11:59 PM, David Ahern <dsahern@xxxxxxxxx> wrote:
> On 1/30/18 1:57 PM, David Ahern wrote:
>> On 1/30/18 1:08 PM, Daniel Borkmann wrote:
>>> On 01/30/2018 07:32 PM, Cong Wang wrote:
>>>> On Tue, Jan 30, 2018 at 4:09 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>>>> Hello,
>>>>>
>>>>> The following program creates a hang in unregister_netdevice.
>>>>> cleanup_net work hangs there forever periodically printing
>>>>> "unregister_netdevice: waiting for lo to become free. Usage count = 3"
>>>>> and creation of any new network namespaces hangs forever.
>>>>
>>>> Interestingly, this is not reproducible on net-next.
>>>
>>> The most recent change on netns refcnt was 4ee806d51176 ("net: tcp: close
>>> sock if net namespace is exiting") in net/net-next from 5 days ago, maybe
>>> fixed due to that?
>>>
>>
>> This appears to be the commit introducing the refcnt leak:
>>
>> $ git bisect bad
>> dbc2b5e9a09e9a6664679a667ff81cff6e5f2641 is the first bad commit
>> commit dbc2b5e9a09e9a6664679a667ff81cff6e5f2641
>> Author: Xin Long <lucien.xin@xxxxxxxxx>
>> Date: Fri May 12 14:39:52 2017 +0800
>>
>>     sctp: fix src address selection if using secondary addresses for ipv6
>>
>>
>> v4.14 is bad. Running bisect in the background while doing other things....
>>
>
> Interesting. The commit that avoids the refcnt leak is
>
> commit 955ec4cb3b54c7c389a9f830be7d3ae2056b9212
> Author: David Ahern <dsahern@xxxxxxxxx>
> Date: Wed Jan 24 19:45:29 2018 -0800
>
>     net/ipv6: Do not allow route add with a device that is down
>
> That commit does not intentionally address the problem so it is just
> masking the problematic code introduced by the commit above.

Thanks, David A.

I'm still traveling. I will look into this ASAP.