kvm: BUG in loaded_vmcs_init

From: Dmitry Vyukov
Date: Fri Jan 06 2017 - 08:04:35 EST


Hello,

The following program triggers a BUG in loaded_vmcs_init when vmm_exclusive=0:

https://gist.githubusercontent.com/dvyukov/b7d05c1dc99ee25f07db786a788271e0/raw/93908cca12d92f32876f40db2dd5beddddc9d709/gistfile1.txt

kernel BUG at arch/x86/kvm/x86.c:332!
invalid opcode: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 2807 Comm: a.out Not tainted 4.10.0-rc2+ #148
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88006b428000 task.stack: ffff88006a700000
RIP: 0010:kvm_spurious_fault+0x9/0x10 arch/x86/kvm/x86.c:332
RSP: 0018:ffff88003fc07c68 EFLAGS: 00010006
RAX: ffff88006b428000 RBX: ffff88003fc07ca0 RCX: ffff88003eb586b0
RDX: 0000000000010000 RSI: ffff88003cb824b8 RDI: ffff88003c602000
RBP: ffff88003fc07c68 R08: ffff88003cb824b0 R09: 1ffff10007d6b0d6
R10: 0000000000000006 R11: 0000000000000000 R12: 000000003c602000
R13: 1ffff10007f80f90 R14: ffff88003c602000 R15: ffff88003fc07ce0
FS: 0000000001fbd880(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006c6ed0 CR3: 0000000004a21000 CR4: 00000000000026f0
Call Trace:
<IRQ>
bad_gs+0x20e/0x6102
loaded_vmcs_init arch/x86/kvm/vmx.c:1466 [inline]
__loaded_vmcs_clear+0x289/0x5e0 arch/x86/kvm/vmx.c:1546
flush_smp_call_function_queue+0x254/0x4c0 kernel/smp.c:234
generic_smp_call_function_single_interrupt+0x13/0x30 kernel/smp.c:183
__smp_call_function_single_interrupt arch/x86/kernel/smp.c:313 [inline]
smp_call_function_single_interrupt+0x5f/0x80 arch/x86/kernel/smp.c:320
call_function_single_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:484
RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x60
RSP: 0018:ffff88006a7072a8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff04
RAX: dffffc0000000000 RBX: 1ffff1000d4e0e5b RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffff1000d4e0e2b RDI: ffffed000d4e0e18
RBP: ffff88006a7073e0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000006 R11: 0000000000000000 R12: 0000000000000000
R13: ffff88003d36d000 R14: ffff88003d36b1a0 R15: ffff88003d36d1b8
</IRQ>
d_delete+0x1a7/0x250 fs/dcache.c:2334
__debugfs_remove.part.10+0xab/0xf0 fs/debugfs/inode.c:595
__debugfs_remove include/linux/dcache.h:485 [inline]
debugfs_remove_recursive+0x22e/0x5e0 fs/debugfs/inode.c:678
kvm_destroy_vm_debugfs arch/x86/kvm/../../../virt/kvm/kvm_main.c:563 [inline]
kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:722 [inline]
kvm_put_kvm+0x137/0x990 arch/x86/kvm/../../../virt/kvm/kvm_main.c:757
kvm_vm_release+0x42/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:768
__fput+0x332/0x7f0 fs/file_table.c:208
____fput+0x15/0x20 fs/file_table.c:244
task_work_run+0x18a/0x260 kernel/task_work.c:116
exit_task_work include/linux/task_work.h:21 [inline]
do_exit+0x18e7/0x28a0 kernel/exit.c:839
do_group_exit+0x149/0x420 kernel/exit.c:943
SYSC_exit_group kernel/exit.c:954 [inline]
SyS_exit_group+0x1d/0x20 kernel/exit.c:952
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x439ca9
RSP: 002b:00007ffe7f572e08 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00000000006c74e0 RCX: 0000000000439ca9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000001 R08: 000000000000003c R09: 00000000000000e7
R10: ffffffffffffffc0 R11: 0000000000000246 R12: 0000000000000000
R13: 00000000006c74e0 R14: 0000000000407140 R15: 0000000000000000
Code: ff ff e8 6b a2 91 00 eb 88 e8 64 a2 91 00 e9 59 ff ff ff 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 e8 f7 f7 63 00 <0f> 0b 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 41 89
RIP: kvm_spurious_fault+0x9/0x10 arch/x86/kvm/x86.c:332 RSP: ffff88003fc07c68
---[ end trace ac4be7c8a8506d04 ]---


On commit e02003b515e8d95f40f20f213622bb82510873d2 (Jan 4).

The code blobs that run in the VM are generated from:
https://gist.githubusercontent.com/dvyukov/a74e0ba6e74a921f9467d1d7d93714fb/raw/cceda361b83ac747c2a52d88c49d29498d6f8d0b/gistfile1.txt
It is a fairly straightforward setup of a restricted guest. I think
the only things that matter here are vmm_exclusive=0 and nested VMs
being created in parallel processes.


When I enabled vmm_exclusive=0 I also started seeing lots of:
KASAN: use-after-free Write in __loaded_vmcs_clear
and:
KASAN: use-after-free Write in vmx_vcpu_load

They probably share the same root cause (some mishandling of the
vmcs lists).
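
To illustrate the kind of list mishandling I have in mind, here is a
minimal userspace sketch. It is not the kernel code and not a diagnosis:
all toy_* names are hypothetical, and "freeing" is modeled with a flag
instead of a real free() so that the program stays well defined. It only
shows that an object released while still linked leaves a stale entry
that later list_add/list_del calls on its neighbours would write to,
which is the pattern both KASAN reports below point at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

/* Hypothetical stand-in for an object that embeds a list link. */
struct toy_vcpu {
    struct toy_list_head link;
    bool freed;             /* models "memory was returned to the slab" */
};

static void toy_list_init(struct toy_list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert n right after head, like the kernel's list_add(). */
static void toy_list_add(struct toy_list_head *n, struct toy_list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;   /* writes into the old first entry */
    head->next = n;
}

/* Unlink e; this writes into both neighbours of e. */
static void toy_list_del(struct toy_list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e;
    e->prev = e;
}

/* Release an object, optionally forgetting to unlink it first. */
static void toy_free(struct toy_vcpu *v, bool unlink_first)
{
    if (unlink_first)
        toy_list_del(&v->link);
    v->freed = true;
}

/* Count entries that are still reachable from head but already freed. */
static int toy_count_stale(const struct toy_list_head *head)
{
    int stale = 0;
    const struct toy_list_head *p;

    for (p = head->next; p != head; p = p->next) {
        const struct toy_vcpu *v = (const struct toy_vcpu *)
            ((const char *)p - offsetof(struct toy_vcpu, link));
        if (v->freed)
            stale++;
    }
    return stale;
}
```

With two objects on the list, toy_free(&a, true) leaves no stale entry,
while toy_free(&b, false) leaves one. In the real kernel any later list
operation next to such an entry is a write into freed memory.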


BUG: KASAN: use-after-free in __list_del include/linux/list.h:104 [inline] at addr ffff880038dced70
BUG: KASAN: use-after-free in __list_del_entry include/linux/list.h:119 [inline] at addr ffff880038dced70
BUG: KASAN: use-after-free in list_del include/linux/list.h:124 [inline] at addr ffff880038dced70
BUG: KASAN: use-after-free in __loaded_vmcs_clear+0x58e/0x5e0 arch/x86/kvm/vmx.c:1536 at addr ffff880038dced70
Write of size 8 by task syz-executor3/5003
CPU: 1 PID: 5003 Comm: syz-executor3 Not tainted 4.10.0-rc2+ #148
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x292/0x3a2 lib/dump_stack.c:51
kasan_object_err+0x1c/0x70 mm/kasan/report.c:165
print_address_description mm/kasan/report.c:203 [inline]
kasan_report_error mm/kasan/report.c:287 [inline]
kasan_report+0x1b6/0x460 mm/kasan/report.c:307
__asan_report_store8_noabort+0x17/0x20 mm/kasan/report.c:338
__list_del include/linux/list.h:104 [inline]
__list_del_entry include/linux/list.h:119 [inline]
list_del include/linux/list.h:124 [inline]
__loaded_vmcs_clear+0x58e/0x5e0 arch/x86/kvm/vmx.c:1536
vmx_vcpu_put+0xa7/0x120 arch/x86/kvm/vmx.c:2344
kvm_arch_vcpu_put+0x1f4/0x3b0 arch/x86/kvm/x86.c:2871
kvm_sched_out+0x87/0xa0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3908
__fire_sched_out_preempt_notifiers kernel/sched/core.c:2663 [inline]
fire_sched_out_preempt_notifiers kernel/sched/core.c:2671 [inline]
prepare_task_switch kernel/sched/core.c:2707 [inline]
context_switch kernel/sched/core.c:2868 [inline]
__schedule+0xbe4/0x1e40 kernel/sched/core.c:3403
preempt_schedule_common+0x35/0x60 kernel/sched/core.c:3513
_cond_resched+0x17/0x20 kernel/sched/core.c:4907
vcpu_run arch/x86/kvm/x86.c:6976 [inline]
kvm_arch_vcpu_ioctl_run+0x12ea/0x4660 arch/x86/kvm/x86.c:7108
kvm_vcpu_ioctl+0x673/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
vfs_ioctl fs/ioctl.c:43 [inline]
do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:683
SYSC_ioctl fs/ioctl.c:698 [inline]
SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4448f9
RSP: 002b:00007fc44142fb58 EFLAGS: 00000286 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000015 RCX: 00000000004448f9
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000015
RBP: 00000000006deb30 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000286 R12: 0000000000700000
R13: 00007fc441bdd238 R14: 00007fc441bf28f0 R15: 0000000000000000
Object at ffff880038dca700, in cache kvm_vcpu size: 19392
Allocated:
PID = 5033
[<ffffffff8128e2e6>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
[<ffffffff819ca403>] save_stack+0x43/0xd0 mm/kasan/kasan.c:502
[<ffffffff819ca6ca>] set_track mm/kasan/kasan.c:514 [inline]
[<ffffffff819ca6ca>] kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:605
[<ffffffff819cacc2>] kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
[<ffffffff819c64f1>] kmem_cache_alloc+0xe1/0x630 mm/slab.c:3563
[<ffffffff811b0ca5>] kmem_cache_zalloc include/linux/slab.h:626 [inline]
[<ffffffff811b0ca5>] vmx_create_vcpu+0xf5/0x2dd0 arch/x86/kvm/vmx.c:9304
[<ffffffff810e53e8>] kvm_arch_vcpu_create+0x138/0x1b0 arch/x86/kvm/x86.c:7543
[<ffffffff8107d6ca>] kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2458 [inline]
[<ffffffff8107d6ca>] kvm_vm_ioctl+0x4da/0x1d00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2960
[<ffffffff81a5d7cf>] vfs_ioctl fs/ioctl.c:43 [inline]
[<ffffffff81a5d7cf>] do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:683
[<ffffffff81a5ee1f>] SYSC_ioctl fs/ioctl.c:698 [inline]
[<ffffffff81a5ee1f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
[<ffffffff8412d441>] entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 5035
[<ffffffff8128e2e6>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
[<ffffffff819ca403>] save_stack+0x43/0xd0 mm/kasan/kasan.c:502
[<ffffffff819cad3f>] set_track mm/kasan/kasan.c:514 [inline]
[<ffffffff819cad3f>] kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:578
[<ffffffff819c8191>] __cache_free mm/slab.c:3505 [inline]
[<ffffffff819c8191>] kmem_cache_free+0x51/0x210 mm/slab.c:3765
[<ffffffff811a9743>] vmx_free_vcpu+0x203/0x270 arch/x86/kvm/vmx.c:9298
[<ffffffff810e8a6f>] kvm_arch_vcpu_free arch/x86/kvm/x86.c:7529 [inline]
[<ffffffff810e8a6f>] kvm_free_vcpus arch/x86/kvm/x86.c:7958 [inline]
[<ffffffff810e8a6f>] kvm_arch_destroy_vm+0x50f/0xa00 arch/x86/kvm/x86.c:8059
[<ffffffff8106b7ae>] kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:736 [inline]
[<ffffffff8106b7ae>] kvm_put_kvm+0x4ee/0x990 arch/x86/kvm/../../../virt/kvm/kvm_main.c:757
[<ffffffff8106bda2>] kvm_vm_release+0x42/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:768
[<ffffffff81a18162>] __fput+0x332/0x7f0 fs/file_table.c:208
[<ffffffff81a186a5>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff814947fa>] task_work_run+0x18a/0x260 kernel/task_work.c:116
[<ffffffff81420527>] exit_task_work include/linux/task_work.h:21 [inline]
[<ffffffff81420527>] do_exit+0x18e7/0x28a0 kernel/exit.c:839
[<ffffffff81425f39>] do_group_exit+0x149/0x420 kernel/exit.c:943
[<ffffffff81454410>] get_signal+0x7e0/0x1820 kernel/signal.c:2313
[<ffffffff8125a4f2>] do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807
[<ffffffff81005ce0>] exit_to_usermode_loop+0x170/0x200 arch/x86/entry/common.c:156
[<ffffffff810091f3>] prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
[<ffffffff810091f3>] syscall_return_slowpath+0x3d3/0x420 arch/x86/entry/common.c:259



BUG: KASAN: use-after-free in __list_add include/linux/list.h:62 [inline] at addr ffff8800618bf0b0
BUG: KASAN: use-after-free in list_add include/linux/list.h:78 [inline] at addr ffff8800618bf0b0
BUG: KASAN: use-after-free in vmx_vcpu_load+0xa21/0xac0 arch/x86/kvm/vmx.c:2285 at addr ffff8800618bf0b0
Write of size 8 by task syz-executor1/7613
CPU: 2 PID: 7613 Comm: syz-executor1 Not tainted 4.10.0-rc2+ #148
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x292/0x3a2 lib/dump_stack.c:51
kasan_object_err+0x1c/0x70 mm/kasan/report.c:165
print_address_description mm/kasan/report.c:203 [inline]
kasan_report_error mm/kasan/report.c:287 [inline]
kasan_report+0x1b6/0x460 mm/kasan/report.c:307
__asan_report_store8_noabort+0x17/0x20 mm/kasan/report.c:338
__list_add include/linux/list.h:62 [inline]
list_add include/linux/list.h:78 [inline]
vmx_vcpu_load+0xa21/0xac0 arch/x86/kvm/vmx.c:2285
kvm_arch_vcpu_load+0x144/0x910 arch/x86/kvm/x86.c:2799
vcpu_load+0x4b/0x70 arch/x86/kvm/../../../virt/kvm/kvm_main.c:150
kvm_vcpu_ioctl+0x1b5/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2551
vfs_ioctl fs/ioctl.c:43 [inline]
do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:683
SYSC_ioctl fs/ioctl.c:698 [inline]
SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4447a7
RSP: 002b:00007f0dc1833538 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000fec00000 RCX: 00000000004447a7
RDX: 00007f0dc1833880 RSI: 000000008138ae83 RDI: 0000000000000015
RBP: 0000000000000010 R08: 0000000020015fb9 R09: 0000000000000047
R10: 00007f0dc18389d0 R11: 0000000000000217 R12: 000000000000000f
R13: 0000000000000006 R14: 0000000000000015 R15: 0000000020004000
Object at ffff8800618baa40, in cache kvm_vcpu size: 19392
Allocated:
PID = 7647
[<ffffffff8128e2e6>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
[<ffffffff819ca403>] save_stack+0x43/0xd0 mm/kasan/kasan.c:502
[<ffffffff819ca6ca>] set_track mm/kasan/kasan.c:514 [inline]
[<ffffffff819ca6ca>] kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:605
[<ffffffff819cacc2>] kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
[<ffffffff819c64f1>] kmem_cache_alloc+0xe1/0x630 mm/slab.c:3563
[<ffffffff811b0ca5>] kmem_cache_zalloc include/linux/slab.h:626 [inline]
[<ffffffff811b0ca5>] vmx_create_vcpu+0xf5/0x2dd0 arch/x86/kvm/vmx.c:9304
[<ffffffff810e53e8>] kvm_arch_vcpu_create+0x138/0x1b0 arch/x86/kvm/x86.c:7543
[<ffffffff8107d6ca>] kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2458 [inline]
[<ffffffff8107d6ca>] kvm_vm_ioctl+0x4da/0x1d00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2960
[<ffffffff81a5d7cf>] vfs_ioctl fs/ioctl.c:43 [inline]
[<ffffffff81a5d7cf>] do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:683
[<ffffffff81a5ee1f>] SYSC_ioctl fs/ioctl.c:698 [inline]
[<ffffffff81a5ee1f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
[<ffffffff8412d441>] entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 7664
[<ffffffff8128e2e6>] save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
[<ffffffff819ca403>] save_stack+0x43/0xd0 mm/kasan/kasan.c:502
[<ffffffff819cad3f>] set_track mm/kasan/kasan.c:514 [inline]
[<ffffffff819cad3f>] kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:578
[<ffffffff819c8191>] __cache_free mm/slab.c:3505 [inline]
[<ffffffff819c8191>] kmem_cache_free+0x51/0x210 mm/slab.c:3765
[<ffffffff811a9743>] vmx_free_vcpu+0x203/0x270 arch/x86/kvm/vmx.c:9298
[<ffffffff810e8a6f>] kvm_arch_vcpu_free arch/x86/kvm/x86.c:7529 [inline]
[<ffffffff810e8a6f>] kvm_free_vcpus arch/x86/kvm/x86.c:7958 [inline]
[<ffffffff810e8a6f>] kvm_arch_destroy_vm+0x50f/0xa00 arch/x86/kvm/x86.c:8059
[<ffffffff8106b7ae>] kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:736 [inline]
[<ffffffff8106b7ae>] kvm_put_kvm+0x4ee/0x990 arch/x86/kvm/../../../virt/kvm/kvm_main.c:757
[<ffffffff8106bda2>] kvm_vm_release+0x42/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:768
[<ffffffff81a18162>] __fput+0x332/0x7f0 fs/file_table.c:208
[<ffffffff81a186a5>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff814947fa>] task_work_run+0x18a/0x260 kernel/task_work.c:116
[<ffffffff81420527>] exit_task_work include/linux/task_work.h:21 [inline]
[<ffffffff81420527>] do_exit+0x18e7/0x28a0 kernel/exit.c:839
[<ffffffff81425f39>] do_group_exit+0x149/0x420 kernel/exit.c:943
[<ffffffff81454410>] get_signal+0x7e0/0x1820 kernel/signal.c:2313
[<ffffffff8125a4f2>] do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807
[<ffffffff81005ce0>] exit_to_usermode_loop+0x170/0x200 arch/x86/entry/common.c:156
[<ffffffff810091f3>] prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
[<ffffffff810091f3>] syscall_return_slowpath+0x3d3/0x420 arch/x86/entry/common.c:259
[<ffffffff8412d4e2>] entry_SYSCALL_64_fastpath+0xc0/0xc2