Re: unregister_netdevice: waiting for DEV to become free (2)
From: Cong Wang
Date: Tue Aug 21 2018 - 01:40:12 EST
On Mon, Aug 20, 2018 at 6:00 AM Julian Anastasov <ja@xxxxxx> wrote:
>
>
> Hello,
>
> On Sun, 19 Aug 2018, syzbot wrote:
>
> > syzbot has found a reproducer for the following crash on:
> >
> > HEAD commit: d7857ae43dcc Add linux-next specific files for 20180817
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13c72fce400000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=4b10cd1ea76bb092
> > dashboard link: https://syzkaller.appspot.com/bug?extid=30209ea299c09d8785c9
> > compiler: gcc (GCC) 8.0.1 20180413 (experimental)
> > syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=15df679a400000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15242741400000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+30209ea299c09d8785c9@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > IPVS: stopping master sync thread 4657 ...
> > IPVS: stopping master sync thread 4663 ...
> > IPVS: sync thread started: state = MASTER, mcast_ifn = syz_tun, syncid = 0, id
> > IPVS: = 0
> > IPVS: sync thread started: state = MASTER, mcast_ifn = syz_tun, syncid = 0, id
> > IPVS: = 0
> > IPVS: stopping master sync thread 4664 ...
> > unregister_netdevice: waiting for lo to become free. Usage count = 1
>
> Well, only IPVS and tun in the game? But IPVS does not
> take any dev references for sync threads. Can it be a problem
> in tun? For example, a side effects from dst_cache_reset?
> May be dst_release is called too late? Here is what should happen
> on unregistration:
There are multiple similar bugs grouped together under this report.
Perhaps they are different, perhaps they are the same bug; it is too
early to say.
For the one I looked into, dst_cache doesn't matter, because the xmit
path doesn't use the tunnel dst_cache at all, and it involves the
ip6tnl0 fallback (FB) device, unlike this one, which involves a tun
device.
>
> - NETDEV_UNREGISTER event: rt_flush_dev changes dst->dev with lo
> but dst is not released
>
> - ndo_uninit/ip_tunnel_uninit: dst_cache_reset is called which
> does nothing!?! May be dst_release call is needed here.
I think this makes sense. At least prior to the introduction of the
general dst_cache, the dst refcnt was released in ndo_uninit() too, so
it is reasonable to move the dst_cache_destroy() call to ndo_uninit().
>
> - no more references are expected here ...
>
> - netdev_run_todo -> netdev_wait_allrefs: loop here due to refcnt!=0
>
> - dev->priv_destructor (ip_tunnel_dev_free) calls dst_cache_destroy
> where dst_release is used but it is not reached because we loop in
> netdev_wait_allrefs above
>
> - dst_cache_destroy: really call dst_release
>
> In fact, after calling rt_flush_dev and replacing the
> dst->dev we should reach dev->priv_destructor (ip_tunnel_dev_free)
> for tun device where dst_release for lo should be called. But may be
> something prevents it, exit batching?
I can't see anything special about the netns exit batching here.
For the one I looked into, it seems some fib6_info is not released for
some reason. It seems to be the one created by addrconf_prefix_route(),
which I think is supposed to be released by fib6_clean_tree(), but that
never happens.
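
To make the ordering concern in the quoted unregistration sequence
concrete, here is a minimal user-space sketch. It is a toy model, not
kernel code: every toy_* name is hypothetical, and it assumes the wait
on lo is evaluated before the tunnel's priv_destructor runs, which is
only one possible explanation for the report. It shows that a reference
moved onto lo by the flush step is still pending at the wait unless the
cache is destroyed in the uninit step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are not the kernel structures. */
struct toy_dev { int refcnt; };
struct toy_dst { struct toy_dev *dev; };   /* a route holding one dev reference */
struct toy_tunnel {
	struct toy_dst slot;               /* storage for the cached route */
	struct toy_dst *cached;            /* NULL when the cache is empty */
};

static void toy_dev_hold(struct toy_dev *d) { d->refcnt++; }

static void toy_dev_put(struct toy_dev *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Cache a route that goes through 'via'; the route pins that device. */
static void toy_cache_set(struct toy_tunnel *t, struct toy_dev *via)
{
	t->slot.dev = via;
	toy_dev_hold(via);
	t->cached = &t->slot;
}

/* Models the flush step: repoint the route at lo, but keep the route alive. */
static void toy_flush_dev(struct toy_tunnel *t, struct toy_dev *dying,
			  struct toy_dev *lo)
{
	if (t->cached && t->cached->dev == dying) {
		toy_dev_hold(lo);
		toy_dev_put(dying);
		t->cached->dev = lo;
	}
}

/* Models destroying the cache: the route and its device reference go away. */
static void toy_cache_destroy(struct toy_tunnel *t)
{
	if (t->cached) {
		toy_dev_put(t->cached->dev);
		t->cached = NULL;
	}
}

/*
 * Walk the teardown steps in order. Returns true if lo is free when it is
 * waited on. 'release_in_uninit' selects whether the cache is destroyed in
 * the uninit step (before the wait) or only in the destructor (after it).
 */
static bool toy_netns_exit(struct toy_tunnel *t, struct toy_dev *via,
			   struct toy_dev *lo, bool release_in_uninit)
{
	toy_flush_dev(t, via, lo);         /* unregister event */
	if (release_in_uninit)
		toy_cache_destroy(t);      /* uninit step */
	if (lo->refcnt != 0)
		return false;              /* the real wait would keep looping */
	toy_cache_destroy(t);              /* destructor step */
	return true;
}
```

With the cache destroyed only in the destructor step, toy_netns_exit()
returns false and lo is left with a usage count of 1; with the destroy
moved to the uninit step it returns true. Whether the real bug follows
this ordering is not established here.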
Thanks.