Re: net: hang in unregister_netdevice: waiting for lo to become free

From: Tommi Rantala
Date: Wed Feb 21 2018 - 09:54:12 EST


On 20.02.2018 18:26, Neil Horman wrote:
On Tue, Feb 20, 2018 at 09:14:41AM +0100, Dmitry Vyukov wrote:
On Tue, Feb 20, 2018 at 8:56 AM, Tommi Rantala
<tommi.t.rantala@xxxxxxxxx> wrote:
On 19.02.2018 20:59, Dmitry Vyukov wrote:
Is this meant to be fixed already? I am still seeing this on the
latest upstream tree.


These two commits are in v4.16-rc1:

commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8
Author: Tommi Rantala <tommi.t.rantala@xxxxxxxxx>
Date: Mon Feb 5 21:48:14 2018 +0200

sctp: fix dst refcnt leak in sctp_v4_get_dst
...
Fixes: 410f03831 ("sctp: add routing output fallback")
Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses")


commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2
Author: Alexey Kodanev <alexey.kodanev@xxxxxxxxxx>
Date: Mon Feb 5 15:10:35 2018 +0300

sctp: fix dst refcnt leak in sctp_v6_get_dst()
...
Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6")


I guess we missed something if it's still reproducible.

I can check it later this week, unless someone else beats me to it.

Hi Tommi,

Hmmm, I can't claim that it's exactly the same bug. Perhaps it's
another one then. But I am still seeing these:

[ 58.799130] unregister_netdevice: waiting for lo to become free. Usage count = 4
[ 60.847138] unregister_netdevice: waiting for lo to become free. Usage count = 4
[ 62.895093] unregister_netdevice: waiting for lo to become free. Usage count = 4
[ 64.943103] unregister_netdevice: waiting for lo to become free. Usage count = 4

on upstream tree pulled ~12 hours ago.

Can you write a systemtap script to probe dev_hold and dev_put, printing out a
backtrace if the device name matches "lo"? That should tell us definitively
whether the problem is in the same location or not.

Hi Dmitry,

I tested with the reproducer and the kernel .config file that you sent in the first email in this thread:

With 4.16-rc2: unable to reproduce.

With 4.15-rc9: bug is reproducible, and I get "unregister_netdevice: waiting for lo to become free. Usage count = 3".

With 4.15-rc9 and Alexey's "sctp: fix dst refcnt leak in sctp_v6_get_dst()" cherry-picked on top: unable to reproduce.
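
To compare the usage counts reported by different kernels, something like the following could pull the device name and count out of a saved dmesg log. This is only a rough sketch: the function name parse_unregister_waits is made up for illustration, and the pattern assumes the message format quoted in this thread (an optional dmesg timestamp, and possibly a line wrap before "Usage count").

```python
import re

# Assumed message format, taken from the log lines quoted in this thread.
# The timestamp prefix is optional, and \s+ tolerates a line wrap between
# "free." and "Usage count".
PATTERN = re.compile(
    r"(?:\[\s*(?P<ts>\d+\.\d+)\]\s*)?"
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\.\s+"
    r"Usage count = (?P<count>-?\d+)"
)


def parse_unregister_waits(text):
    """Return a list of (timestamp or None, device, usage_count) tuples."""
    results = []
    for m in PATTERN.finditer(text):
        ts = float(m.group("ts")) if m.group("ts") else None
        results.append((ts, m.group("dev"), int(m.group("count"))))
    return results


if __name__ == "__main__":
    sample = (
        "[ 58.799130] unregister_netdevice: waiting for lo to become free. "
        "Usage count = 4\n"
        "[ 60.847138] unregister_netdevice: waiting for lo to become free. "
        "Usage count = 4\n"
    )
    for ts, dev, count in parse_unregister_waits(sample):
        print(ts, dev, count)
```

A count that stays constant across the repeated messages (4 in Dmitry's log, 3 in my 4.15-rc9 run) at least suggests a fixed number of leaked references rather than a growing one, but it does not say who holds them.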


Is syzkaller doing something else now to trigger the bug...?
Can you still trigger the bug with the same reproducer?


Tommi