Re: WARNING in ib_unregister_device_queued

From: Jason Gunthorpe
Date: Tue Apr 28 2020 - 08:31:07 EST


On Tue, Apr 28, 2020 at 12:19:55PM +0800, Hillf Danton wrote:
>
> Sun, 26 Apr 2020 06:43:13 -0700
> > syzbot found the following crash on:
> >
> > HEAD commit: b9663b7c net: stmmac: Enable SERDES power up/down sequence
> > git tree: net
> > console output: https://syzkaller.appspot.com/x/log.txt?x=166bf717e00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2
> > dashboard link: https://syzkaller.appspot.com/bug?extid=4088ed905e4ae2b0e13b
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+4088ed905e4ae2b0e13b@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > rdma_rxe: ignoring netdev event = 10 for netdevsim0
> > infiniband yz2: set down
> > WARNING: CPU: 0 PID: 22753 at drivers/infiniband/core/device.c:1565 ib_unregister_device_queued+0x122/0x160 drivers/infiniband/core/device.c:1565
> > Kernel panic - not syncing: panic_on_warn set ...
> > CPU: 0 PID: 22753 Comm: syz-executor.5 Not tainted 5.7.0-rc1-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > Call Trace:
> > __dump_stack lib/dump_stack.c:77 [inline]
> > dump_stack+0x188/0x20d lib/dump_stack.c:118
> > panic+0x2e3/0x75c kernel/panic.c:221
> > __warn.cold+0x2f/0x35 kernel/panic.c:582
> > report_bug+0x27b/0x2f0 lib/bug.c:195
> > fixup_bug arch/x86/kernel/traps.c:175 [inline]
> > fixup_bug arch/x86/kernel/traps.c:170 [inline]
> > do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
> > do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
> > invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
> > RIP: 0010:ib_unregister_device_queued+0x122/0x160 drivers/infiniband/core/device.c:1565
> > Code: fb e8 72 e2 d4 fb 48 89 ef e8 2a c3 c1 fe 48 83 c4 08 5b 5d e9 5f e2 d4 fb e8 5a e2 d4 fb 0f 0b e9 46 ff ff ff e8 4e e2 d4 fb <0f> 0b e9 6f ff ff ff 48 89 ef e8 2f a9 12 fc e9 16 ff ff ff 48 c7
> > RSP: 0018:ffffc900072ef290 EFLAGS: 00010246
> > RAX: 0000000000040000 RBX: ffff8880a6a24000 RCX: ffffc90013201000
> > RDX: 0000000000040000 RSI: ffffffff859e51b2 RDI: ffff8880a6a24310
> > RBP: 0000000000000019 R08: ffff88808d21c280 R09: ffffed1014d449bb
> > R10: ffff8880a6a24dd3 R11: ffffed1014d449ba R12: 0000000000000006
> > R13: ffff88805988c000 R14: 0000000000000000 R15: ffffffff8a44f8c0
> > rxe_notify+0x77/0xd0 drivers/infiniband/sw/rxe/rxe_net.c:605
> > notifier_call_chain+0xc0/0x230 kernel/notifier.c:83
> > call_netdevice_notifiers_info net/core/dev.c:1948 [inline]
> > call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1933
> > call_netdevice_notifiers_extack net/core/dev.c:1960 [inline]
> > call_netdevice_notifiers net/core/dev.c:1974 [inline]
> > rollback_registered_many+0x75c/0xe70 net/core/dev.c:8828
> > rollback_registered+0xf2/0x1c0 net/core/dev.c:8873
> > unregister_netdevice_queue net/core/dev.c:9969 [inline]
> > unregister_netdevice_queue+0x1d7/0x2b0 net/core/dev.c:9962
> > unregister_netdevice include/linux/netdevice.h:2725 [inline]
> > nsim_destroy+0x35/0x60 drivers/net/netdevsim/netdev.c:330
> > __nsim_dev_port_del+0x144/0x1e0 drivers/net/netdevsim/dev.c:934
> > nsim_dev_port_del_all+0x86/0xe0 drivers/net/netdevsim/dev.c:947
> > nsim_dev_reload_destroy+0x77/0x110 drivers/net/netdevsim/dev.c:1123
> > nsim_dev_reload_down+0x6e/0xd0 drivers/net/netdevsim/dev.c:703
> > devlink_reload+0xbd/0x3b0 net/core/devlink.c:2797
> > devlink_nl_cmd_reload+0x2f7/0x7c0 net/core/devlink.c:2832
> > genl_family_rcv_msg_doit net/netlink/genetlink.c:673 [inline]
> > genl_family_rcv_msg net/netlink/genetlink.c:718 [inline]
> > genl_rcv_msg+0x627/0xdf0 net/netlink/genetlink.c:735
> > netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
> > genl_rcv+0x24/0x40 net/netlink/genetlink.c:746
> > netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
> > netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
> > netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
> > sock_sendmsg_nosec net/socket.c:652 [inline]
> > sock_sendmsg+0xcf/0x120 net/socket.c:672
> > ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362
> > ___sys_sendmsg+0x100/0x170 net/socket.c:2416
> > __sys_sendmsg+0xec/0x1b0 net/socket.c:2449
> > do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
> > entry_SYSCALL_64_after_hwframe+0x49/0xb3
>
>
> Quiesce the warning by adding a dummy destruct function and using it in
> the error path that is supposedly related to triggering the warning.

Yeah, something like that might work if this is the source of the bug

Thanks,
Jason