kvm: assorted bugs after OOMs

From: Dmitry Vyukov
Date: Sun Dec 11 2016 - 03:40:24 EST


Hello,

When I run the following program on a VM with 2GB of memory:

https://gist.githubusercontent.com/dvyukov/8ccb55ae7da35ac780d55efc07a058a6/raw/cd5cadc1226a132b8d2731eaf8b03ac4d52e2763/gistfile1.txt

it starts causing OOM kills as each process allocates ~1GB, and
eventually it kills the VM. However, if I keep rebooting the machine
and running the program again, every other run I also see assorted
other bugs interleaved with the OOM messages.
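
(The gist itself is not reproduced here. For context, a minimal sketch of
the kind of program that triggers such OOMs on a 2GB VM could look like
the following; this is an assumed illustration, not the exact gist
contents.)

    /* Hypothetical sketch, NOT the linked gist: fork several workers,
     * each allocating and touching ~1GB so the OOM killer fires on a
     * 2GB VM. */
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define NPROCS 4
    #define ALLOC_SIZE (1UL << 30)   /* ~1GB per worker */

    int main(void)
    {
            for (int i = 0; i < NPROCS; i++) {
                    if (fork() == 0) {
                            char *p = malloc(ALLOC_SIZE);
                            if (p)
                                    memset(p, 0xaa, ALLOC_SIZE); /* fault in every page */
                            pause();
                    }
            }
            while (wait(NULL) > 0)
                    ;
            return 0;
    }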

Here is one:

------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu.c:1130!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 0 PID: 6297 Comm: a.out Not tainted 4.9.0-rc8+ #80
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff8800345728c0 task.stack: ffff880034698000
RIP: 0010:[<ffffffff81102001>] [<ffffffff81102001>]
rmap_get_first+0xf1/0x190 arch/x86/kvm/mmu.c:1130
RSP: 0000:ffff88003469f3c8 EFLAGS: 00010297
RAX: ffff8800345728c0 RBX: ffff88003c4b4840 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88003469f420 RDI: ffffc900214bcd08
RBP: ffff88003469f3e0 R08: 0000000000000001 R09: 0000000000000000
R10: f778ebb8de62db96 R11: 0000000000000004 R12: 0000000000000000
R13: ffffc900214bcd08 R14: ffff8800345728c0 R15: 0000000000000000
FS: 0000000001447880(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa875083900 CR3: 00000000369e7000 CR4: 00000000000026f0
Stack:
ffffc900214bcd08 ffff88003469f420 dffffc0000000000 ffff88003469f488
ffffffff81104bec 1ffff100068d3e80 ffff88003469f420 0000000041b58ab3
ffffffff894d1ec5 ffffffff81104b60 0000000000000000 0000000000000000
Call Trace:
[<ffffffff81104bec>] __rmap_write_protect.constprop.115+0x8c/0x1a0
arch/x86/kvm/mmu.c:1235
[<ffffffff81104d1c>] slot_rmap_write_protect+0x1c/0x20 arch/x86/kvm/mmu.c:4733
[<ffffffff81100dd9>] slot_handle_level_range+0x1f9/0x380
arch/x86/kvm/mmu.c:4650
[< inline >] slot_handle_level arch/x86/kvm/mmu.c:4674
[< inline >] slot_handle_all_level arch/x86/kvm/mmu.c:4684
[<ffffffff8112eab4>] kvm_mmu_slot_remove_write_access+0x94/0x150
arch/x86/kvm/mmu.c:4742
[< inline >] kvm_mmu_slot_apply_flags arch/x86/kvm/x86.c:8088
[<ffffffff810ee304>] kvm_arch_commit_memory_region+0x144/0x270
arch/x86/kvm/x86.c:8175
[<ffffffff8108173a>] __kvm_set_memory_region+0x18aa/0x2630
arch/x86/kvm/../../../virt/kvm/kvm_main.c:1040
[<ffffffff810824f3>] kvm_set_memory_region+0x33/0x50
arch/x86/kvm/../../../virt/kvm/kvm_main.c:1076
[< inline >] kvm_vm_ioctl_set_memory_region
arch/x86/kvm/../../../virt/kvm/kvm_main.c:1088
[<ffffffff8108372e>] kvm_vm_ioctl+0x121e/0x1a70
arch/x86/kvm/../../../virt/kvm/kvm_main.c:2960
[< inline >] vfs_ioctl fs/ioctl.c:43
[<ffffffff81abdf94>] do_vfs_ioctl+0x1c4/0x1630 fs/ioctl.c:679
[< inline >] SYSC_ioctl fs/ioctl.c:694
[<ffffffff81abf494>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:685
[<ffffffff8816c405>] entry_SYSCALL_64_fastpath+0x23/0xc6
arch/x86/entry/entry_64.S:209
Code: 6d 5e 00 48 89 da 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80
3c 02 00 0f 85 98 00 00 00 4c 8b 23 45 85 e4 75 31 e8 7f 6d 5e 00 <0f>
0b e8 78 6d 5e 00 4c 89 e2 48 b8 00 00 00 00 00 fc ff df 48
RIP [<ffffffff81102001>] rmap_get_first+0xf1/0x190 arch/x86/kvm/mmu.c:1130
RSP <ffff88003469f3c8>
---[ end trace 14ff65581fafb17b ]---


Here is another:

------------[ cut here ]------------
WARNING: CPU: 3 PID: 37 at arch/x86/kernel/smp.c:127[< none
>] native_smp_send_reschedule+0xae/0xd0 arch/x86/kernel/smp.c:127
Modules linked in:
CPU: 3 PID: 37 Comm: kworker/u10:1 Not tainted 4.9.0-rc8+ #80
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: writeback wb_workfn (flush-8:0)
ffff88006d1076e8 ffffffff8348fbd9 ffffffff00000003 1ffff1000da20e70
ffffed000da20e68 0000000041b58ab3 ffffffff8957cf20 ffffffff8348f8eb
ffffffff818784e3 ffff8800354b46c0 0000000000000000 0000000000000003
Call Trace:
<IRQ> [ 27.106006] [<ffffffff8348fbd9>] dump_stack+0x2ee/0x3f5
[<ffffffff813fca54>] __warn+0x1a4/0x1e0 kernel/panic.c:550
[<ffffffff813fcce1>] warn_slowpath_null+0x31/0x40 kernel/panic.c:585
[<ffffffff812b6abe>] native_smp_send_reschedule+0xae/0xd0
arch/x86/kernel/smp.c:127
[< inline >] smp_send_reschedule ./arch/x86/include/asm/smp.h:115
[< inline >] ttwu_queue_remote kernel/sched/core.c:1845
[< inline >] ttwu_queue kernel/sched/core.c:1889
[<ffffffff814c03e0>] try_to_wake_up+0xcb0/0x1080 kernel/sched/core.c:2095
[<ffffffff814c08e5>] default_wake_function+0x35/0x50 kernel/sched/core.c:3626
[<ffffffff8153ab2e>] autoremove_wake_function+0x5e/0x130
kernel/sched/wait.c:281
[<ffffffff815388f9>] __wake_up_common+0xb9/0x150 kernel/sched/wait.c:73
[<ffffffff81538a7a>] __wake_up+0xea/0x180 kernel/sched/wait.c:95
[<ffffffff81591dbf>] wake_up_klogd_work_func+0x4f/0x80
kernel/printk/printk.c:2866
[<ffffffff817c9186>] irq_work_run_list+0x1c6/0x2c0 kernel/irq_work.c:156
[<ffffffff817c92b1>] irq_work_run+0x31/0x40 kernel/irq_work.c:172
[<ffffffff8164204e>] flush_smp_call_function_queue+0x31e/0x550 kernel/smp.c:245
[<ffffffff81645028>]
generic_smp_call_function_single_interrupt+0x18/0x80 kernel/smp.c:180
[< inline >] __smp_call_function_single_interrupt
arch/x86/kernel/smp.c:315
[<ffffffff812b83d4>] smp_call_function_single_interrupt+0x64/0x90
arch/x86/kernel/smp.c:322
[<ffffffff812b840e>] smp_call_function_interrupt+0xe/0x10
./arch/x86/include/asm/apic.h:403
[<ffffffff8816e89c>] call_function_interrupt+0x8c/0xa0
arch/x86/entry/entry_64.S:489
<EOI> [ 27.106006] [<ffffffff81849c64>] ? panic+0x3cc/0x425
[< inline >] schedule_debug kernel/sched/core.c:3241
[<ffffffff88155cea>] __schedule+0x184a/0x1db0 kernel/sched/core.c:3345
[<ffffffff881571b5>] preempt_schedule_common+0x35/0x60 kernel/sched/core.c:3512
[<ffffffff881571fc>] _cond_resched+0x1c/0x30 kernel/sched/core.c:4906
[<ffffffff881653b1>] down_write+0x71/0x120 kernel/locking/rwsem.c:51
[< inline >] mpage_map_and_submit_extent fs/ext4/inode.c:2458
[<ffffffff81ddae60>] ext4_writepages+0x27c0/0x3820 fs/ext4/inode.c:2755
[<ffffffff818a27ff>] do_writepages+0x9f/0x100 mm/page-writeback.c:2347
[<ffffffff81b326d4>] __writeback_single_inode+0x1f4/0x1550
fs/fs-writeback.c:1305
[<ffffffff81b35667>] writeback_sb_inodes+0x897/0x14a0 fs/fs-writeback.c:1569
[<ffffffff81b36440>] __writeback_inodes_wb+0x1d0/0x360 fs/fs-writeback.c:1638
[<ffffffff81b371f5>] wb_writeback+0x885/0xfc0 fs/fs-writeback.c:1747
[<ffffffff81b37bd1>] wb_do_writeback+0x2a1/0xad0 fs/fs-writeback.c:1882
[<ffffffff81b3a91b>] wb_workfn+0x25b/0xb00 fs/fs-writeback.c:1922
[<ffffffff814731c0>] process_one_work+0xb40/0x1ba0 kernel/workqueue.c:2096
[<ffffffff81474434>] worker_thread+0x214/0x18a0 kernel/workqueue.c:2230
[<ffffffff8148a058>] kthread+0x328/0x3e0 kernel/kthread.c:209
[<ffffffff8816c69a>] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
---[ end trace f8601d619592d8d2 ]---


Here is another:

BUG: Bad rss-counter state mm:ffff8800379149c0 idx:0 val:32
BUG: Bad rss-counter state mm:ffff8800674c4100 idx:0 val:32
at kernel/fork.c:807


The bugs seem to be easier to reproduce if you run the following
syzkaller program according to these instructions:
https://github.com/google/syzkaller/wiki/How-to-execute-syzkaller-programs
run as:
./syz-execprog -threaded=0 -procs=6 -repeat=0 -sandbox=none prog
Depending on the amount of memory in your VM, you may need to adjust
the number of parallel subprocesses in the C program, or the -procs
argument of syz-execprog, so that it starts triggering OOMs. The
provided numbers work for a VM with 2 GB of memory. A rough C
translation of the program is sketched after it below.

mmap(&(0x7f0000000000/0x23000)=nil, (0x23000), 0x3, 0x32,
0xffffffffffffffff, 0x0)
r0 = syz_open_dev$kvm(&(0x7f000000c000)="2f6465762f6b766d00", 0x0, 0x0)
r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)
ioctl$KVM_SET_USER_MEMORY_REGION(r1, 0x4020ae46,
&(0x7f0000023000-0x20)={0x0, 0x2, 0x0, 0x1000,
&(0x7f0000023000-0x14)=@gdt=[{0x2a}, {0x3f}, {0x6}, {0x3}]})
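
For readers not set up to run syz-execprog, the program above roughly
corresponds to the following C sequence. This is a hand translation, so
the mmap layout, field values, and lack of error handling are my reading
of the syzkaller encoding rather than generated code.

    /* Rough hand translation of the syzkaller program above (my reading
     * of the encoding, not generated code). The original maps a fixed
     * region at 0x7f0000000000 and points userspace_addr 0x14 bytes
     * before the end of that mapping; flags 0x2 is KVM_MEM_READONLY. */
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <linux/kvm.h>

    int main(void)
    {
            char *mem = mmap((void *)0x7f0000000000UL, 0x23000,
                             PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
            int kvm = open("/dev/kvm", O_RDWR);
            int vm = ioctl(kvm, KVM_CREATE_VM, 0);

            struct kvm_userspace_memory_region region = {
                    .slot = 0,
                    .flags = KVM_MEM_READONLY,   /* 0x2 in the program */
                    .guest_phys_addr = 0,
                    .memory_size = 0x1000,
                    .userspace_addr = (unsigned long)(mem + 0x23000 - 0x14),
            };
            ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);
            return 0;
    }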


On commit 318c8932ddec5c1c26a4af0f3c053784841c598e (Dec 7).