Re: kvm: deadlock between kvm_io_bus_register_dev/kvm_hv_set_msr_common

From: Paolo Bonzini
Date: Fri Dec 09 2016 - 14:13:16 EST




On 09/12/2016 17:09, Dmitry Vyukov wrote:
> [resending an plain test]
>
> Hello,
>
> While running syzkaller fuzzer I am getting tasks deadlocked at the
> following stacks. Seems that they are waiting for each other:

This is pretty simple, kvm_hv_get/set_msr_common are taking kvm->lock
inside vcpu->mutex which is wrong. We need to split Hyper-V emulation
code to take a different mutex or spinlock.

Paolo

> # cat /proc/6715/task/*/stack
> [<ffffffff815ba488>] __synchronize_srcu+0x2f8/0x4a0 kernel/rcu/srcu.c:448
> [<ffffffff815ba693>] synchronize_srcu_expedited+0x13/0x20 kernel/rcu/srcu.c:510
> [<ffffffff810884db>] kvm_io_bus_register_dev+0x2ab/0x3e0
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:3559
> [<ffffffff8117f4a6>] kvm_create_pit+0x5c6/0x8c0 arch/x86/kvm/i8254.c:694
> [<ffffffff810df9b3>] kvm_arch_vm_ioctl+0x14d3/0x24b0 arch/x86/kvm/x86.c:3978
> [<ffffffff8108270a>] kvm_vm_ioctl+0x1fa/0x1a70
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:3099
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [<ffffffff81abdf24>] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
> [< inline >] SYSC_ioctl fs/ioctl.c:694
> [<ffffffff81abf424>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
> [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6
> arch/x86/entry/entry_64.S:209
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> [<ffffffff81198f33>] kvm_hv_set_msr_common+0x163/0x2a30
> arch/x86/kvm/hyperv.c:1145
> [<ffffffff810beedb>] kvm_set_msr_common+0xb0b/0x23a0 arch/x86/kvm/x86.c:2278
> [<ffffffff811cdb0d>] vmx_set_msr+0x27d/0xcb0 arch/x86/kvm/vmx.c:3149
> [<ffffffff81094479>] kvm_set_msr+0xd9/0x170 arch/x86/kvm/x86.c:1093
> [<ffffffff81094823>] do_set_msr+0x123/0x1a0 arch/x86/kvm/x86.c:1122
> [< inline >] __msr_io arch/x86/kvm/x86.c:2540
> [<ffffffff810bd060>] msr_io+0x250/0x460 arch/x86/kvm/x86.c:2577
> [<ffffffff810d9f00>] kvm_arch_vcpu_ioctl+0x360/0x4580 arch/x86/kvm/x86.c:3420
> [<ffffffff8107aab7>] kvm_vcpu_ioctl+0x237/0x11c0
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2710
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [<ffffffff81abdf24>] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
> [< inline >] SYSC_ioctl fs/ioctl.c:694
> [<ffffffff81abf424>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
> [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6
> arch/x86/entry/entry_64.S:209
> [<ffffffffffffffff>] 0xffffffffffffffff
>
>
> Later kernel detects the hang and prints:
>
> INFO: task syz-executor:6729 blocked for more than 120 seconds.
> Not tainted 4.9.0-rc8+ #77
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> syz-executor D17168 6729 1 0x00000004
> ffff880037510000 ffff88006aaed1c0 ffff880066046080 ffff88006c176400
> ffff88003ed22d18 ffff8800398bef60 ffffffff8818e77c ffffffff894eefa0
> ffff88003ed236a8 ffff88003ed23680 1ffff10007317dd3 ffff88003ed22d18
> Call Trace:
> [<ffffffff8818fedd>] schedule+0x10d/0x460 kernel/sched/core.c:3457
> [<ffffffff881a1378>] schedule_timeout+0x498/0x6c0 kernel/time/timer.c:1738
> [< inline >] do_wait_for_common kernel/sched/completion.c:75
> [< inline >] __wait_for_common kernel/sched/completion.c:93
> [< inline >] wait_for_common kernel/sched/completion.c:101
> [<ffffffff88194f2f>] wait_for_completion+0x2df/0x420
> kernel/sched/completion.c:122
> [<ffffffff815ba488>] __synchronize_srcu+0x2f8/0x4a0 kernel/rcu/srcu.c:448
> [<ffffffff815ba693>] synchronize_srcu_expedited+0x13/0x20 kernel/rcu/srcu.c:510
> [<ffffffff810884db>] kvm_io_bus_register_dev+0x2ab/0x3e0
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:3559
> [<ffffffff8117f4a6>] kvm_create_pit+0x5c6/0x8c0 arch/x86/kvm/i8254.c:694
> [<ffffffff810df9b3>] kvm_arch_vm_ioctl+0x14d3/0x24b0 arch/x86/kvm/x86.c:3978
> [<ffffffff8108270a>] kvm_vm_ioctl+0x1fa/0x1a70
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:3099
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [<ffffffff81abdf24>] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
> [< inline >] SYSC_ioctl fs/ioctl.c:694
> [<ffffffff81abf424>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
> [<ffffffff881a5f85>] entry_SYSCALL_64_fastpath+0x23/0xc6
> arch/x86/entry/entry_64.S:209
>
> Showing all locks held in the system:
> 2 locks held by khungtaskd/1330:
> #0: (rcu_read_lock){......}, at: [< inline >]
> check_hung_uninterruptible_tasks kernel/hung_task.c:168
> #0: (rcu_read_lock){......}, at: [<ffffffff816f6fdc>]
> watchdog+0x1cc/0xd70 kernel/hung_task.c:239
> #1: (tasklist_lock){.+.+..}, at: [<ffffffff81560472>]
> debug_show_all_locks+0xd2/0x420 kernel/locking/lockdep.c:4329
> 1 lock held by rsyslogd/6191:
> #0: (&f->f_pos_lock){+.+.+.}, at: [<ffffffff81aeeef0>]
> __fdget_pos+0x140/0x1b0 fs/file.c:781
> 2 locks held by getty/6279:
> #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff881a32dc>]
> ldsem_down_read+0x3c/0x50 drivers/tty/tty_ldsem.c:367
> #1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff839fcb96>]
> n_tty_read+0x316/0x14a0 drivers/tty/n_tty.c:2133
> 2 locks held by getty/6280:
> #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff881a32dc>]
> ldsem_down_read+0x3c/0x50 drivers/tty/tty_ldsem.c:367
> #1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff839fcb96>]
> n_tty_read+0x316/0x14a0 drivers/tty/n_tty.c:2133
> 2 locks held by getty/6281:
> #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff881a32dc>]
> ldsem_down_read+0x3c/0x50 drivers/tty/tty_ldsem.c:367
> #1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff839fcb96>]
> n_tty_read+0x316/0x14a0 drivers/tty/n_tty.c:2133
> 2 locks held by getty/6282:
> #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff881a32dc>]
> ldsem_down_read+0x3c/0x50 drivers/tty/tty_ldsem.c:367
> #1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff839fcb96>]
> n_tty_read+0x316/0x14a0 drivers/tty/n_tty.c:2133
> 2 locks held by getty/6283:
> #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff881a32dc>]
> ldsem_down_read+0x3c/0x50 drivers/tty/tty_ldsem.c:367
> #1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff839fcb96>]
> n_tty_read+0x316/0x14a0 drivers/tty/n_tty.c:2133
> 2 locks held by getty/6284:
> #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff881a32dc>]
> ldsem_down_read+0x3c/0x50 drivers/tty/tty_ldsem.c:367
> #1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff839fcb96>]
> n_tty_read+0x316/0x14a0 drivers/tty/n_tty.c:2133
> 2 locks held by bash/6294:
> #0: (&tty->ldisc_sem){++++++}, at: [<ffffffff881a32dc>]
> ldsem_down_read+0x3c/0x50 drivers/tty/tty_ldsem.c:367
> #1: (&ldata->atomic_read_lock){+.+...}, at: [<ffffffff839fcb96>]
> n_tty_read+0x316/0x14a0 drivers/tty/n_tty.c:2133
> 2 locks held by syz-executor/6729:
> #0: (&kvm->lock){+.+.+.}, at: [<ffffffff810df0f4>]
> kvm_arch_vm_ioctl+0xc14/0x24b0 arch/x86/kvm/x86.c:3973
> #1: (&kvm->slots_lock){+.+.+.}, at: [<ffffffff8117f469>]
> kvm_create_pit+0x589/0x8c0 arch/x86/kvm/i8254.c:692
> 3 locks held by syz-executor/6752:
> #0: (&vcpu->mutex){+.+.+.}, at: [<ffffffff8106c901>]
> vcpu_load+0x21/0x70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:143
> #1: (&kvm->srcu){......}, at: [< inline >] __msr_io
> arch/x86/kvm/x86.c:2538
> #1: (&kvm->srcu){......}, at: [<ffffffff810bcf58>]
> msr_io+0x148/0x460 arch/x86/kvm/x86.c:2577
> #2: (&kvm->lock){+.+.+.}, at: [<ffffffff81198f33>]
> kvm_hv_set_msr_common+0x163/0x2a30 arch/x86/kvm/hyperv.c:1145
> =============================================
>
>
> I am on commit 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7).
>