Re: [syzbot] linux-next test error: WARNING in devl_port_unregister

From: Ido Schimmel
Date: Wed Nov 09 2022 - 04:55:33 EST


On Mon, Nov 07, 2022 at 11:02:52AM -0800, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: d8e87774068a Add linux-next specific files for 20221107
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=17b99fde880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=97401fe9f72601bf
> dashboard link: https://syzkaller.appspot.com/bug?extid=c2ca18f0fccdd1f09c66
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/671a9d3d5dc6/disk-d8e87774.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/ef1309efbb19/vmlinux-d8e87774.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/7592dabd2a3a/bzImage-d8e87774.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+c2ca18f0fccdd1f09c66@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
> netdevsim netdevsim0 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 11 at net/core/devlink.c:9998 devl_port_unregister+0x2f6/0x390 net/core/devlink.c:9998
> Modules linked in:
> CPU: 1 PID: 11 Comm: kworker/u4:1 Not tainted 6.1.0-rc3-next-20221107-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
> Workqueue: netns cleanup_net
> RIP: 0010:devl_port_unregister+0x2f6/0x390 net/core/devlink.c:9998
> Code: e8 8f 45 fc f9 85 ed 0f 85 7a fd ff ff e8 b2 48 fc f9 0f 0b e9 6e fd ff ff e8 a6 48 fc f9 0f 0b e9 53 ff ff ff e8 9a 48 fc f9 <0f> 0b e9 94 fd ff ff e8 de f9 48 fa e9 78 ff ff ff e8 a4 f9 48 fa
> RSP: 0018:ffffc90000107a08 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: ffff888020492810 RCX: 0000000000000000
> RDX: ffff888011a33a80 RSI: ffffffff87809286 RDI: 0000000000000005
> RBP: 0000000000000002 R08: 0000000000000005 R09: 0000000000000000
> R10: 0000000000000002 R11: 0000000000000000 R12: ffff888020492810
> R13: ffff888020492808 R14: ffff888020491800 R15: ffff888020492800
> FS: 0000000000000000(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000c00213d000 CR3: 000000007318b000 CR4: 00000000003506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> __nsim_dev_port_del+0x1bb/0x240 drivers/net/netdevsim/dev.c:1433
> nsim_dev_port_del_all drivers/net/netdevsim/dev.c:1443 [inline]
> nsim_dev_reload_destroy+0x171/0x510 drivers/net/netdevsim/dev.c:1660
> nsim_dev_reload_down+0x6b/0xd0 drivers/net/netdevsim/dev.c:968
> devlink_reload+0x1c2/0x6b0 net/core/devlink.c:4501
> devlink_pernet_pre_exit+0x104/0x1c0 net/core/devlink.c:12609
> ops_pre_exit_list net/core/net_namespace.c:159 [inline]
> cleanup_net+0x451/0xb10 net/core/net_namespace.c:594
> process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
> worker_thread+0x665/0x1080 kernel/workqueue.c:2436
> kthread+0x2e4/0x3a0 kernel/kthread.c:376
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
> </TASK>

I am testing the following patch. It fixes a similar issue in our regression tests.

commit 0cbd9cd1b4dd96f2bd446b47ccef3ece5f24759f
Author: Ido Schimmel <idosch@xxxxxxxxxx>
Date: Wed Nov 9 11:19:55 2022 +0200

devlink: Fix warning when unregistering a port

When a devlink port is unregistered, its type is expected to be unset;
otherwise, a WARNING is generated [1]. The cited commit was supposed to
handle this by clearing the type upon 'NETDEV_PRE_UNINIT'.

The assumption was that no other events can be generated for the netdev
after this event, but this proved to be wrong. After the event is
generated, netdev_wait_allrefs_any() will rebroadcast
'NETDEV_UNREGISTER' until the netdev's reference count drops to 1, which
causes devlink to set the port type back to Ethernet.

Fix by only setting and clearing the port type upon 'NETDEV_POST_INIT'
and 'NETDEV_PRE_UNINIT', respectively. For all other events, preserve
the port type.
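
For illustration only, the event ordering described above can be modelled
in a small userspace sketch. This is not the kernel code: every name
prefixed with model_ is hypothetical, the handling of POST_INIT is an
assumption taken from the description above, and the event sequence is an
assumed example of one rebroadcast after 'NETDEV_PRE_UNINIT'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the netdev notifier events named above. */
enum model_event {
	MODEL_POST_INIT,
	MODEL_REGISTER,
	MODEL_UNREGISTER,
	MODEL_PRE_UNINIT,
};

/* Hypothetical stand-ins for the devlink port types. */
enum model_port_type {
	MODEL_TYPE_NOTSET,
	MODEL_TYPE_ETH,
};

struct model_port {
	enum model_port_type type;
	bool has_netdev;	/* models the tracked netdev pointer */
};

/*
 * Apply one event to the port. With fixed == false, REGISTER and
 * UNREGISTER force the type to Ethernet, as before the patch.
 * With fixed == true they keep the current type, as after the patch.
 */
void model_handle_event(struct model_port *port, enum model_event ev,
			bool fixed)
{
	switch (ev) {
	case MODEL_POST_INIT:
		/* Assumption: the type is set here, no netdev tracked yet. */
		port->type = MODEL_TYPE_ETH;
		port->has_netdev = false;
		break;
	case MODEL_REGISTER:
		if (!fixed)
			port->type = MODEL_TYPE_ETH;
		port->has_netdev = true;
		break;
	case MODEL_UNREGISTER:
		if (!fixed)
			port->type = MODEL_TYPE_ETH;
		port->has_netdev = false;
		break;
	case MODEL_PRE_UNINIT:
		port->type = MODEL_TYPE_NOTSET;
		port->has_netdev = false;
		break;
	}
}

/* Replay n events on a fresh port and return the final port type. */
enum model_port_type model_replay(const enum model_event *events, size_t n,
				  bool fixed)
{
	struct model_port port = { MODEL_TYPE_NOTSET, false };
	size_t i;

	assert(events != NULL || n == 0);
	for (i = 0; i < n; i++)
		model_handle_event(&port, events[i], fixed);
	return port.type;
}

/* Assumed sequence: a full lifecycle plus one rebroadcast UNREGISTER. */
const enum model_event model_rebroadcast_seq[] = {
	MODEL_POST_INIT, MODEL_REGISTER, MODEL_UNREGISTER,
	MODEL_PRE_UNINIT, MODEL_UNREGISTER,
};
```

Replaying model_rebroadcast_seq with fixed set to false leaves the type at
MODEL_TYPE_ETH, which is the state in which the unregister check would
warn. With fixed set to true the type stays at MODEL_TYPE_NOTSET.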

[1]
WARNING: CPU: 0 PID: 11 at net/core/devlink.c:9998 devl_port_unregister+0x2f6/0x390 net/core/devlink.c:9998
Modules linked in:
CPU: 1 PID: 11 Comm: kworker/u4:1 Not tainted 6.1.0-rc3-next-20221107-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Workqueue: netns cleanup_net
RIP: 0010:devl_port_unregister+0x2f6/0x390 net/core/devlink.c:9998
[...]
Call Trace:
<TASK>
__nsim_dev_port_del+0x1bb/0x240 drivers/net/netdevsim/dev.c:1433
nsim_dev_port_del_all drivers/net/netdevsim/dev.c:1443 [inline]
nsim_dev_reload_destroy+0x171/0x510 drivers/net/netdevsim/dev.c:1660
nsim_dev_reload_down+0x6b/0xd0 drivers/net/netdevsim/dev.c:968
devlink_reload+0x1c2/0x6b0 net/core/devlink.c:4501
devlink_pernet_pre_exit+0x104/0x1c0 net/core/devlink.c:12609
ops_pre_exit_list net/core/net_namespace.c:159 [inline]
cleanup_net+0x451/0xb10 net/core/net_namespace.c:594
process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>

Fixes: 02a68a47eade ("net: devlink: track netdev with devlink_port assigned")
Reported-by: syzbot+85e47e1a08b3e159b159@xxxxxxxxxxxxxxxxxxxxxxxxx
Reported-by: syzbot+c2ca18f0fccdd1f09c66@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ido Schimmel <idosch@xxxxxxxxxx>

diff --git a/net/core/devlink.c b/net/core/devlink.c
index 6bbe230c4ec5..7f789bbcbbd7 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -10177,7 +10177,7 @@ static int devlink_netdevice_event(struct notifier_block *nb,
 		 * we take into account netdev pointer appearing in this
 		 * namespace.
 		 */
-		__devlink_port_type_set(devlink_port, DEVLINK_PORT_TYPE_ETH,
+		__devlink_port_type_set(devlink_port, devlink_port->type,
 					netdev);
 		break;
 	case NETDEV_UNREGISTER:
@@ -10185,7 +10185,7 @@ static int devlink_netdevice_event(struct notifier_block *nb,
 		 * also during net namespace change so we need to clear
 		 * pointer to netdev that is going to another net namespace.
 		 */
-		__devlink_port_type_set(devlink_port, DEVLINK_PORT_TYPE_ETH,
+		__devlink_port_type_set(devlink_port, devlink_port->type,
 					NULL);
 		break;
 	case NETDEV_PRE_UNINIT: