Re: [PATCH net] bonding: Fix warning in default_device_exit_batch()
From: Nikolay Aleksandrov
Date: Sun Mar 12 2023 - 16:58:29 EST
On 12/03/2023 17:21, Shigeru Yoshida wrote:
> syzbot reported warning in default_device_exit_batch() like below [1]:
>
> WARNING: CPU: 1 PID: 56 at net/core/dev.c:10867 unregister_netdevice_many_notify+0x14cf/0x19f0 net/core/dev.c:10867
> ...
> Call Trace:
> <TASK>
> unregister_netdevice_many net/core/dev.c:10897 [inline]
> default_device_exit_batch+0x451/0x5b0 net/core/dev.c:11350
> ops_exit_list+0x125/0x170 net/core/net_namespace.c:174
> cleanup_net+0x4ee/0xb10 net/core/net_namespace.c:613
> process_one_work+0x9bf/0x1820 kernel/workqueue.c:2390
> worker_thread+0x669/0x1090 kernel/workqueue.c:2537
> kthread+0x2e8/0x3a0 kernel/kthread.c:376
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
> </TASK>
>
> For bond devices which also has a master device, IFF_SLAVE flag is
> cleared at err_undo_flags label in bond_enslave() if it is not
> ARPHRD_ETHER type. In this case, __bond_release_one() is not called
> when bond_netdev_event() received NETDEV_UNREGISTER event. This
> causes the above warning.
>
> This patch fixes this issue by setting IFF_SLAVE flag at
> err_undo_flags label in bond_enslave() if the bond device has a master
> device.
>
The proper way is to check if the bond device had the IFF_SLAVE flag before the
ether_setup() call which clears it, and restore it after.
> Fixes: 7d5cd2ce5292 ("bonding: correctly handle bonding type change on enslave failure")
> Cc: Nikolay Aleksandrov <nikolay@xxxxxxxxxxxxxxxxxxx>
> Link: https://syzkaller.appspot.com/bug?id=391c7b1f6522182899efba27d891f1743e8eb3ef [1]
> Reported-by: syzbot+9dfc3f3348729cc82277@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Shigeru Yoshida <syoshida@xxxxxxxxxx>
> ---
> drivers/net/bonding/bond_main.c | 2 ++
> include/net/bonding.h | 5 +++++
> 2 files changed, 7 insertions(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 00646aa315c3..1a8b59e1468d 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2291,6 +2291,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
> dev_close(bond_dev);
> ether_setup(bond_dev);
> bond_dev->flags |= IFF_MASTER;
> + if (bond_has_master(bond))
> + bond_dev->flags |= IFF_SLAVE;
> bond_dev->priv_flags &= ~IFF_TX_SKB_SHARING;
> }
> }
> diff --git a/include/net/bonding.h b/include/net/bonding.h
> index ea36ab7f9e72..ed0b49501fad 100644
> --- a/include/net/bonding.h
> +++ b/include/net/bonding.h
> @@ -57,6 +57,11 @@
>
> #define bond_has_slaves(bond) !list_empty(bond_slave_list(bond))
>
> +/* master list primitives */
> +#define bond_master_list(bond) (&(bond)->dev->adj_list.upper)
> +
> +#define bond_has_master(bond) !list_empty(bond_master_list(bond))
> +
This is not the proper way to check for a master device.
> /* IMPORTANT: bond_first/last_slave can return NULL in case of an empty list */
> #define bond_first_slave(bond) \
> (bond_has_slaves(bond) ? \
The device flags are wrong because of ether_setup() which clears IFF_SLAVE, we should
just check if it was present before and restore it after the ether_setup() call.
I'll send a fix tomorrow after testing it.
Thanks,
Nik