Re: net: hang in unregister_netdevice: waiting for lo to become free

From: Dmitry Vyukov
Date: Thu May 10 2018 - 02:46:38 EST

Next message: Jon Maxwell: "[PATCH net-next v2] tcp: Add mark for TIMEWAIT sockets"
Previous message: Sean Wang: "Re: [PATCH v1 6/7] Bluetooth: hci_mediatek: Add protocol support for MediaTek serial devices"
Next in thread: Dan Streetman: "Re: net: hang in unregister_netdevice: waiting for lo to become free"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Apr 16, 2018 at 9:42 PM, Dan Streetman <ddstreet@xxxxxxxx> wrote:
>>>>> On Wed, Feb 21, 2018 at 3:53 PM, Tommi Rantala
>>>>> <tommi.t.rantala@xxxxxxxxx> wrote:
>>>>>> On 20.02.2018 18:26, Neil Horman wrote:
>>>>>>>
>>>>>>> On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
>>>>>>>>
>>>>>>>> On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
>>>>>>>> <tommi.t.rantala@xxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> On 19.02.2018 20:59, Dmitry Vyukov wrote:
>>>>>>>>>>
>>>>>>>>>> Is this meant to be fixed already? I am still seeing this on the
>>>>>>>>>> latest upstream tree.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> These two commits are in v4.16-rc1:
>>>>>>>>>
>>>>>>>>> commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
>>>>>>>>> Author: Tommi Rantala <tommi.t.rantala@xxxxxxxxx>
>>>>>>>>> Date: Mon Feb 5 21:48:14 2018 +0200
>>>>>>>>>
>>>>>>>>> sctp: fix dst refcnt leak in sctp_v4_get_dst
>>>>>>>>> ...
>>>>>>>>> Fixes: 410f03831 ("sctp: add routing output fallback")
>>>>>>>>> Fixes: 0ca50d12f ("sctp: fix src address selection if using
>>>>>>>>> secondary
>>>>>>>>> addresses")
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
>>>>>>>>> Author: Alexey Kodanev <alexey.kodanev@xxxxxxxxxx>
>>>>>>>>> Date: Mon Feb 5 15:10:35 2018 +0300
>>>>>>>>>
>>>>>>>>> sctp: fix dst refcnt leak in sctp_v6_get_dst()
>>>>>>>>> ...
>>>>>>>>> Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using
>>>>>>>>> secondary
>>>>>>>>> addresses for ipv6")
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I guess we missed something if it's still reproducible.
>>>>>>>>>
>>>>>>>>> I can check it later this week, unless someone else beat me to it.
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Tommi,
>>>>>>>>
>>>>>>>> Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
>>>>>>>> another one then. But I am still seeing these:
>>>>>>>>
>>>>>>>> [ 58.799130] unregister_netdevice: waiting for lo to become free.
>>>>>>>> Usage count = 4
>>>>>>>> [ 60.847138] unregister_netdevice: waiting for lo to become free.
>>>>>>>> Usage count = 4
>>>>>>>> [ 62.895093] unregister_netdevice: waiting for lo to become free.
>>>>>>>> Usage count = 4
>>>>>>>> [ 64.943103] unregister_netdevice: waiting for lo to become free.
>>>>>>>> Usage count = 4
>>>>>>>>
>>>>>>>> on upstream tree pulled ~12 hours ago.
>>>>>>>>
>>>>>>> Can you write a systemtap script to probe dev_hold, and dev_put, printing
>>>>>>> out a
>>>>>>> backtrace if the device name matches "lo". That should tell us
>>>>>>> definitively if
>>>>>>> the problem is in the same location or not
>>>>>>
>>>>>>
>>>>>> Hi Dmitry, I tested with the reproducer and the kernel .config file that you
>>>>>> sent in the first email in this thread:
>>>>>>
>>>>>> With 4.16-rc2 unable to reproduce.
>>>>>>
>>>>>> With 4.15-rc9 bug reproducible, and I get "unregister_netdevice: waiting for
>>>>>> lo to become free. Usage count = 3"
>>>>>>
>>>>>> With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()"
>>>>>> cherry-picked on top, unable to reproduce.
>>>>>>
>>>>>>
>>>>>> Is syzkaller doing something else now to trigger the bug...?
>>>>>> Can you still trigger the bug with the same reproducer?
>>>>>
>>>>> Hi Neil, Tommi,
>>>>>
>>>>> Reviving this old thread about "unregister_netdevice: waiting for lo
>>>>> to become free. Usage count = 3" hangs.
>>>>> I still did not have time to deep dive into what happens there (too
>>>>> many bugs coming from syzbot). But this still actively happens and I
>>>>> suspect accounts to a significant portion of various hang reports,
>>>>> which are quite unpleasant.
>>>>>
>>>>> One idea that could make it all simpler:
>>>>>
>>>>> Is this wait loop in netdev_wait_allrefs() supposed to wait for any
>>>>> prolonged periods of time under any non-buggy conditions? E.g. more
>>>>> than 1-2 minutes?
>>>>> If it only supposed to wait briefly for things that already supposed
>>>>> to be shutting down, and we add a WARNING there after some timeout,
>>>>> then syzbot will report all info how/when it happens, hopefully
>>>>> extracting reproducers, and all the nice things.
>>>>> But this WARNING should not have any false positives under any
>>>>> realistic conditions (e.g. waiting for arrival of remote packets with
>>>>> large timeouts).
>>>>>
>>>>> Looking at some task hung reports, it seems that this code holds some
>>>>> mutexes, takes workqueue thread and prevents any progress with
>>>>> destruction of other devices (and net namespace creation/destruction),
>>>>> so I guess it should not wait for any indefinite periods of time?
>>>>
>>>> I'm working on this currently:
>>>> https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1711407
>>>>
>>>> I added a summary of what I've found to be the cause (or at least, one
>>>> possible cause) of this:
>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407/comments/72
>>>>
>>>> I'm working on a patch to work around the main side-effect of this,
>>>> which is hanging while holding the global net mutex. Hangs will still
>>>> happen (e.g. if a dst leaks) but should not affect anything else,
>>>> other than a leak of the dst and its net namespace.
>>>>
>>>> Fixing the dst leaks is important too, of course, but a dst leak (or
>>>> other cause) shouldn't break the entire system.
>>>
>>> Leaking some memory is definitely better than hanging the system.
>>>
>>> So I've made syzkaller to recognize "unregister_netdevice: waiting for
>>> (.*) to become free" as a kernel bug:
>>> https://github.com/google/syzkaller/commit/7a67784ca8bdc3b26cce2f0ec9a40d2dd9ec9396
>>> Unfortunately it does not make it catch these bugs because creating a
>>> net namespace per test is too damn slow, so namespaces are reused for
>>> lots of tests and when/if it's eventually destroyed it's already too
>>> late to find root cause.
>>>
>>> But I've run a one-off experiment with prompt net namespace
>>> destruction and syzkaller was able to easily extract a C reproducer:
>>> https://gist.githubusercontent.com/dvyukov/d571e8fff24e127ca48a8c4790d42bfa/raw/52050e93ba9afbb5126b9d7bb39b7e71a82af016/gistfile1.txt
>>>
>>> On upstream 16e205cf42da1f497b10a4a24f563e6c0d574eec with this config:
>>> https://gist.githubusercontent.com/dvyukov/9663c57443adb21f2795b92ef0829d62/raw/bbea0652e23746096dd56855a28f6c681aebcdee/gistfile1.txt
>>>
>>> this gives me:
>>>
>>> [ 83.183198] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 9
>>> [ 85.231202] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 9
>>> ...
>>> [ 523.511205] unregister_netdevice: waiting for lo to become free.
>>> Usage count = 9
>>> ...
>>>
>>> This is generated from this syzkaller program:
>>>
>>> r0 = socket$inet6(0xa, 0x1, 0x84)
>>> setsockopt$inet6_IPV6_XFRM_POLICY(r0, 0x29, 0x23,
>>> &(0x7f0000000380)={{{@in6=@remote={0xfe, 0x80, [], 0xbb},
>>> @in=@dev={0xac, 0x14, 0x14}, 0x0, 0x0, 0x0, 0x0, 0xa}, {}, {}, 0x0,
>>> 0x0, 0x1}, {{@in=@local={0xac, 0x14, 0x14, 0xaa}, 0x0, 0x32}, 0x0,
>>> @in=@local={0xac, 0x14, 0x14, 0xaa}, 0x3504}}, 0xe8)
>>> bind$inet6(r0, &(0x7f0000000000)={0xa, 0x4e20}, 0x1c)
>>> connect$inet(r0, &(0x7f0000000040)={0x2, 0x4e20, @dev={0xac, 0x14,
>>> 0x14, 0xd}}, 0x10)
>>> syz_emit_ethernet(0x3e, &(0x7f00000001c0)={@local={[0xaa, 0xaa, 0xaa,
>>> 0xaa, 0xaa], 0xaa}, @dev={[0xaa, 0xaa, 0xaa, 0xaa, 0xaa]}, [],
>>> {@ipv6={0x86dd, {0x0, 0x6, "50a09c", 0x8, 0xffffff11, 0x0,
>>> @remote={0xfe, 0x80, [], 0xbb}, @local={0xfe, 0x80, [], 0xaa}, {[],
>>> @udp={0x0, 0x4e20, 0x8}}}}}}, &(0x7f0000000040))
>>>
>>> So this seems to be related to IPv6 and/or xfrm and is potentially
>>> caused by external packets (that syz_emit_ethernet call).
>>
>>
>>
>> Here is another repro which seems to be a different bug (note that it
>> requires fault injection):
>>
>> https://gist.githubusercontent.com/dvyukov/1c56623016cc4c24a69d433c5114ad5b/raw/530478f571b195193101b912aa646948528baa8e/gistfile1.txt
>>
>> Dan, do you mind taking a look at them? Fixing these should eliminate
>> root causes of these hangs/leaks.
>
> Yep I will look at them, thanks for the reproducers.

Hi Dan,

Any updates on this? syzbot is hitting this all the time.

Next message: Jon Maxwell: "[PATCH net-next v2] tcp: Add mark for TIMEWAIT sockets"
Previous message: Sean Wang: "Re: [PATCH v1 6/7] Bluetooth: hci_mediatek: Add protocol support for MediaTek serial devices"
Next in thread: Dan Streetman: "Re: net: hang in unregister_netdevice: waiting for lo to become free"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]