Re: unregister_netdevice: waiting for DEV to become free (2)

From: Julian Anastasov
Date: Mon Aug 20 2018 - 08:55:28 EST



Hello,

On Sun, 19 Aug 2018, syzbot wrote:

> syzbot has found a reproducer for the following crash on:
>
> HEAD commit: d7857ae43dcc Add linux-next specific files for 20180817
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=13c72fce400000
> kernel config: https://syzkaller.appspot.com/x/.config?x=4b10cd1ea76bb092
> dashboard link: https://syzkaller.appspot.com/bug?extid=30209ea299c09d8785c9
> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=15df679a400000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15242741400000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+30209ea299c09d8785c9@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> IPVS: stopping master sync thread 4657 ...
> IPVS: stopping master sync thread 4663 ...
> IPVS: sync thread started: state = MASTER, mcast_ifn = syz_tun, syncid = 0, id
> IPVS: = 0
> IPVS: sync thread started: state = MASTER, mcast_ifn = syz_tun, syncid = 0, id
> IPVS: = 0
> IPVS: stopping master sync thread 4664 ...
> unregister_netdevice: waiting for lo to become free. Usage count = 1

Well, only IPVS and tun in the game? But IPVS does not
take any dev references for sync threads. Can it be a problem
in tun? For example, a side effect of dst_cache_reset?
Maybe dst_release is called too late? Here is what should happen
on unregistration:

- NETDEV_UNREGISTER event: rt_flush_dev replaces dst->dev with lo
but the dst is not released

- ndo_uninit/ip_tunnel_uninit: dst_cache_reset is called which
does nothing!?! Maybe a dst_release call is needed here.

- no more references are expected here ...

- netdev_run_todo -> netdev_wait_allrefs: loop here due to refcnt!=0

- dev->priv_destructor (ip_tunnel_dev_free) calls dst_cache_destroy
where dst_release is used but it is not reached because we loop in
netdev_wait_allrefs above

- dst_cache_destroy: dst_release is really called

In fact, after calling rt_flush_dev and replacing the
dst->dev we should reach dev->priv_destructor (ip_tunnel_dev_free)
for the tun device, where dst_release for lo should be called. But maybe
something prevents it, exit batching?

Regards

--
Julian Anastasov <ja@xxxxxx>