Re: INFO: task hung in vhost_init_device_iotlb
From: Michael S. Tsirkin
Date: Tue Jan 29 2019 - 11:06:01 EST

On Tue, Jan 29, 2019 at 01:22:02AM -0800, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: 983542434e6b Merge tag 'edac_fix_for_5.0' of git://git.ker..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=17476498c00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=505743eba4e4f68
> dashboard link: https://syzkaller.appspot.com/bug?extid=40e28a8bd59d10ed0c42
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.

Hmm, nothing obvious below. Generic corruption elsewhere?

> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+40e28a8bd59d10ed0c42@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> protocol 88fb is buggy, dev hsr_slave_1
> protocol 88fb is buggy, dev hsr_slave_0
> protocol 88fb is buggy, dev hsr_slave_1
> protocol 88fb is buggy, dev hsr_slave_0
> protocol 88fb is buggy, dev hsr_slave_1
> INFO: task syz-executor5:9417 blocked for more than 140 seconds.
> Not tainted 5.0.0-rc3+ #48
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor5 D27576 9417 8469 0x00000004
> Call Trace:
> context_switch kernel/sched/core.c:2831 [inline]
> __schedule+0x897/0x1e60 kernel/sched/core.c:3472
> schedule+0xfe/0x350 kernel/sched/core.c:3516
> protocol 88fb is buggy, dev hsr_slave_0
> protocol 88fb is buggy, dev hsr_slave_1
> schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3574
> __mutex_lock_common kernel/locking/mutex.c:1002 [inline]
> __mutex_lock+0xa3b/0x1670 kernel/locking/mutex.c:1072
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
> vhost_init_device_iotlb+0x124/0x280 drivers/vhost/vhost.c:1606
> vhost_net_set_features drivers/vhost/net.c:1674 [inline]
> vhost_net_ioctl+0x1282/0x1c00 drivers/vhost/net.c:1739
> vfs_ioctl fs/ioctl.c:46 [inline]
> file_ioctl fs/ioctl.c:509 [inline]
> do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
> ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> __do_sys_ioctl fs/ioctl.c:720 [inline]
> __se_sys_ioctl fs/ioctl.c:718 [inline]
> __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
> protocol 88fb is buggy, dev hsr_slave_0
> protocol 88fb is buggy, dev hsr_slave_1
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x458099
> Code: Bad RIP value.
> RSP: 002b:00007efd7ca9bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099
> RDX: 0000000020000080 RSI: 000000004008af00 RDI: 0000000000000003
> RBP: 000000000073bfa0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007efd7ca9c6d4
> R13: 00000000004c295b R14: 00000000004d5280 R15: 00000000ffffffff
> INFO: task syz-executor5:9418 blocked for more than 140 seconds.
> Not tainted 5.0.0-rc3+ #48
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor5 D27800 9418 8469 0x00000004
> Call Trace:
> context_switch kernel/sched/core.c:2831 [inline]
> __schedule+0x897/0x1e60 kernel/sched/core.c:3472
> schedule+0xfe/0x350 kernel/sched/core.c:3516
> schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3574
> __mutex_lock_common kernel/locking/mutex.c:1002 [inline]
> __mutex_lock+0xa3b/0x1670 kernel/locking/mutex.c:1072
> mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
> vhost_net_set_owner drivers/vhost/net.c:1697 [inline]
> vhost_net_ioctl+0x426/0x1c00 drivers/vhost/net.c:1754
> vfs_ioctl fs/ioctl.c:46 [inline]
> file_ioctl fs/ioctl.c:509 [inline]
> do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
> ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> __do_sys_ioctl fs/ioctl.c:720 [inline]
> __se_sys_ioctl fs/ioctl.c:718 [inline]
> __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x458099
> Code: Bad RIP value.
> RSP: 002b:00007efd7ca7ac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099
> RDX: 0000000000000000 RSI: 000040010000af01 RDI: 0000000000000003
> RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007efd7ca7b6d4
> R13: 00000000004c33a4 R14: 00000000004d5e80 R15: 00000000ffffffff
>
> Showing all locks held in the system:
> 1 lock held by khungtaskd/1040:
> #0: 00000000b7479fbe (rcu_read_lock){....}, at:
> debug_show_all_locks+0xc6/0x41d kernel/locking/lockdep.c:4389
> 1 lock held by rsyslogd/8285:
> #0: 000000006d9ccf7d (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1b3/0x1f0
> fs/file.c:795
> 2 locks held by getty/8406:
> #0: 00000000052e805b (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> drivers/tty/tty_ldsem.c:341
> #1: 00000000b90dc267 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> 2 locks held by getty/8407:
> #0: 000000009fdef632 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> drivers/tty/tty_ldsem.c:341
> #1: 00000000ff2b1a16 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> 2 locks held by getty/8408:
> #0: 00000000e48a8e78 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> drivers/tty/tty_ldsem.c:341
> #1: 000000008fcf2060 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> 2 locks held by getty/8409:
> #0: 0000000063f3f4f5 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> drivers/tty/tty_ldsem.c:341
> #1: 000000001dc973ca (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> 2 locks held by getty/8410:
> #0: 00000000f3c14150 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> drivers/tty/tty_ldsem.c:341
> #1: 000000007987cec5 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> 2 locks held by getty/8411:
> #0: 00000000d04f4305 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> drivers/tty/tty_ldsem.c:341
> #1: 000000003f47e3a6 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> 2 locks held by getty/8412:
> #0: 0000000082430560 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> drivers/tty/tty_ldsem.c:341
> #1: 0000000094609d81 (&ldata->atomic_read_lock){+.+.}, at:
> n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> 2 locks held by syz-executor5/9417:
> #0: 0000000020a0f0a1 (&dev->mutex#4){+.+.}, at: vhost_net_set_features
> drivers/vhost/net.c:1668 [inline]
> #0: 0000000020a0f0a1 (&dev->mutex#4){+.+.}, at:
> vhost_net_ioctl+0x204/0x1c00 drivers/vhost/net.c:1739
> #1: 00000000a7b5872b (&vq->mutex){+.+.}, at:
> vhost_init_device_iotlb+0x124/0x280 drivers/vhost/vhost.c:1606
> 1 lock held by syz-executor5/9418:
> #0: 0000000020a0f0a1 (&dev->mutex#4){+.+.}, at: vhost_net_set_owner
> drivers/vhost/net.c:1697 [inline]
> #0: 0000000020a0f0a1 (&dev->mutex#4){+.+.}, at:
> vhost_net_ioctl+0x426/0x1c00 drivers/vhost/net.c:1754
> 1 lock held by vhost-9408/9413:
>
> =============================================
>
> NMI backtrace for cpu 0
> CPU: 0 PID: 1040 Comm: khungtaskd Not tainted 5.0.0-rc3+ #48
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x1db/0x2d0 lib/dump_stack.c:113
> nmi_cpu_backtrace.cold+0x63/0xa4 lib/nmi_backtrace.c:101
> nmi_trigger_cpumask_backtrace+0x1be/0x236 lib/nmi_backtrace.c:62
> arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
> check_hung_uninterruptible_tasks kernel/hung_task.c:203 [inline]
> watchdog+0xbbb/0x1170 kernel/hung_task.c:287
> kthread+0x357/0x430 kernel/kthread.c:246
> ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
> Sending NMI from CPU 0 to CPUs 1:
> NMI backtrace for cpu 1
> CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.0.0-rc3+ #48
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: bat_events batadv_nc_worker
> RIP: 0010:__sanitizer_cov_trace_const_cmp1+0x15/0x20 kernel/kcov.c:174
> Code: 00 48 89 e5 48 8b 4d 08 e8 18 ff ff ff 5d c3 66 0f 1f 44 00 00 55 40
> 0f b6 d6 40 0f b6 f7 bf 01 00 00 00 48 89 e5 48 8b 4d 08 <e8> f6 fe ff ff 5d
> c3 0f 1f 40 00 55 0f b7 d6 0f b7 f7 bf 03 00 00
> RSP: 0018:ffff8880a947f8a8 EFLAGS: 00000246
> RAX: ffff8880a94701c0 RBX: ffff8880a05efc40 RCX: ffffffff87d36c97
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
> RBP: ffff8880a947f8a8 R08: ffff8880a94701c0 R09: ffffed1015ce5b90
> R10: ffffed1015ce5b8f R11: ffff8880ae72dc7b R12: 0000000000000000
> R13: 0000000000000000 R14: 000000000000019e R15: dffffc0000000000
> FS: 0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffff600400 CR3: 00000000a005a000 CR4: 00000000001426e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> rcu_read_unlock include/linux/rcupdate.h:657 [inline]
> batadv_nc_purge_orig_hash net/batman-adv/network-coding.c:423 [inline]
> batadv_nc_worker+0x2f7/0x920 net/batman-adv/network-coding.c:730
> process_one_work+0xd0c/0x1ce0 kernel/workqueue.c:2153
> worker_thread+0x143/0x14a0 kernel/workqueue.c:2296
> kthread+0x357/0x430 kernel/kthread.c:246
> ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxxx
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> syzbot.