Re: kvm: GPF in kvm_lapic_latched_init

From: Jeff Merkey
Date: Fri Jan 15 2016 - 14:59:18 EST


On 1/8/16, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> Hello,
>
> The following program triggers GPF in kvm_lapic_latched_init if run in
> a parallel loop:
> https://gist.githubusercontent.com/dvyukov/524b398f379440b21115/raw/9627095f57a72501fb51bf7565471d31732beeee/gistfile1.txt
>
> kasan: GPF could be caused by NULL-ptr deref or user memory
> accessgeneral protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
> Modules linked in:
> CPU: 3 PID: 14426 Comm: a.out Not tainted 4.4.0-rc8+ #217
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
> 01/01/2011
> task: ffff880061099780 ti: ffff880062e30000 task.ti: ffff880062e30000
> RIP: 0010:[<ffffffff81057171>] [<ffffffff81057171>]
> kvm_arch_vcpu_ioctl+0xa31/0x2ef0
> RSP: 0018:ffff880062e37900 EFLAGS: 00010206
> RAX: dffffc0000000000 RBX: 1ffff1000c5c6f25 RCX: 1ffff1000c41b7cb
> RDX: 000000000000001e RSI: 000000008040ae9f RDI: 00000000000000f0
> RBP: ffff880062e37c10 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: ffff880062e37be8 R15: 0000000000000000
> FS: 00007f4aa815f700(0000) GS:ffff88006d700000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00007f4aa795de78 CR3: 00000000613c2000 CR4: 00000000000026e0
> Stack:
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000020006fe4 0000000041b58ab3 ffffffff86e2e588 ffffffff81056740
> 0000000000000001 ffff880061099f60 0000000000000498 ffff880061099f68
> Call Trace:
> [<ffffffff8101cb52>] kvm_vcpu_ioctl+0x1e2/0xd00
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2526
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [<ffffffff817b36b1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
> [< inline >] SYSC_ioctl fs/ioctl.c:622
> [<ffffffff817b3eff>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
> [<ffffffff85e745b6>] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> Code: 85 2d 20 00 00 4d 8b a4 24 60 03 00 00 e8 c8 8b 50 00 49 8d bc
> 24 f0 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80>
> 3c 02 00 0f 85 f3 1f 00 00 4d 8b a4 24 f0 00 00 00 41 83 e4
> RIP [< inline >] constant_test_bit
> ./arch/x86/include/asm/bitops.h:311
> RIP [< inline >] kvm_lapic_latched_init arch/x86/kvm/lapic.h:164
> RIP [< inline >] kvm_vcpu_ioctl_x86_get_vcpu_events
> arch/x86/kvm/x86.c:2936
> RIP [<ffffffff81057171>] kvm_arch_vcpu_ioctl+0xa31/0x2ef0
> arch/x86/kvm/x86.c:3347
> RSP <ffff880062e37900>
> ---[ end trace 16449377928e034b ]---
>
>
> or:
>
> kasan: GPF could be caused by NULL-ptr deref or user memory
> accessgeneral protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
> Modules linked in:
> CPU: 0 PID: 9555 Comm: syz-executor Not tainted 4.4.0-rc8+ #217
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
> 01/01/2011
> task: ffff88006301de00 ti: ffff880062568000 task.ti: ffff880062568000
> RIP: 0010:[<ffffffff810cf5ab>] [<ffffffff810cf5ab>]
> wait_lapic_expire+0x6b/0x560
> RSP: 0018:ffff88006256fa48 EFLAGS: 00010006
> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff88006301e5c8
> RDX: 0000000000000011 RSI: 0000000000000000 RDI: ffff880033590360
> RBP: ffff88006256fa88 R08: 0000000000000001 R09: 0000000000000002
> R10: 0000000000000001 R11: 0000000000000001 R12: ffff880033590000
> R13: ffff880033590030 R14: 0000000000000088 R15: ffff88003359002c
> FS: 00007f4809354700(0000) GS:ffff88003ec00000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4808b53000 CR3: 0000000033f3f000 CR4: 00000000000026f0
> Stack:
> ffff88006256fa70 0000000000000082 0000000000000003 ffff88006301de00
> ffff880033590030 ffff880033590030 ffff880033590000 ffff88003359002c
> ffff88006256fc10 ffffffff8106a1dc ffffffff8106a75b 0000000000013210
> Call Trace:
> [< inline >] vcpu_enter_guest arch/x86/kvm/x86.c:6523
> [< inline >] vcpu_run arch/x86/kvm/x86.c:6660
> [<ffffffff8106a1dc>] kvm_arch_vcpu_ioctl_run+0x25ec/0x5820
> arch/x86/kvm/x86.c:6818
> [<ffffffff8101cf61>] kvm_vcpu_ioctl+0x5f1/0xd00
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2375
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [<ffffffff817b36b1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
> [< inline >] SYSC_ioctl fs/ioctl.c:622
> [<ffffffff817b3eff>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
> [<ffffffff85e745b6>] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> Code: 60 03 00 00 0f 1f 44 00 00 e8 92 07 49 00 4c 8d b3 88 00 00 00
> e8 86 07 49 00 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80>
> 3c 02 00 0f 85 d8 04 00 00 4c 8b ab 88 00 00 00 4d 85 ed 75
> RIP [<ffffffff810cf5ab>] wait_lapic_expire+0x6b/0x560
> arch/x86/kvm/lapic.c:1245
> RSP <ffff88006256fa48>
> ---[ end trace 560c2b85e36670bc ]---
>
> or:
>
> kasan: GPF could be caused by NULL-ptr deref or user memory
> accessgeneral protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
> Modules linked in:
> CPU: 3 PID: 11264 Comm: syz-executor Not tainted 4.4.0-rc8+ #217
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
> 01/01/2011
> task: ffff880064d55e00 ti: ffff880064dc0000 task.ti: ffff880064dc0000
> RIP: 0010:[<ffffffff810d138d>] [<ffffffff810d138d>]
> apic_has_pending_timer+0x7d/0x210
> RSP: 0018:ffff880064dc7a60 EFLAGS: 00010206
> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000004
> RDX: 0000000000000017 RSI: 0000000000000000 RDI: 00000000000000b8
> RBP: ffff880064dc7a70 R08: 0000000000000002 R09: 0000000000000001
> R10: ffff880064d55e00 R11: ffff880063528220 R12: ffff880063250030
> R13: ffff880063250030 R14: ffff880063250000 R15: 0000000000000000
> FS: 00007fb05f305700(0000) GS:ffff88006d700000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000006d7760 CR3: 0000000065ae9000 CR4: 00000000000026e0
> Stack:
> ffff880063250000 ffff880063250030 ffff880064dc7a88 ffffffff810c7af5
> ffffffff86fee5c0 ffff880064dc7c10 ffffffff810685d4 ffffffff8106a75b
> 0000000000013210 ffff880065a35000 1ffff1000c9b8f59 ffff880064dc0008
> Call Trace:
> [<ffffffff810c7af5>] kvm_cpu_has_pending_timer+0x15/0x20
> arch/x86/kvm/irq.c:36
> [< inline >] vcpu_run arch/x86/kvm/x86.c:6669
> [<ffffffff810685d4>] kvm_arch_vcpu_ioctl_run+0x9e4/0x5820
> arch/x86/kvm/x86.c:6818
> [<ffffffff8101cf61>] kvm_vcpu_ioctl+0x5f1/0xd00
> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2375
> [< inline >] vfs_ioctl fs/ioctl.c:43
> [<ffffffff817b36b1>] do_vfs_ioctl+0x681/0xe40 fs/ioctl.c:607
> [< inline >] SYSC_ioctl fs/ioctl.c:622
> [<ffffffff817b3eff>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:613
> [<ffffffff85e745b6>] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> Code: ba e9 48 00 0f 1f 44 00 00 e8 b0 e9 48 00 e8 ab e9 48 00 48 8d
> bb b8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80>
> 3c 02 00 0f 85 46 01 00 00 4c 8b a3 b8 00 00 00 48 b8 00 00
> RIP [< inline >] arch_static_branch
> ./arch/x86/include/asm/jump_label.h:21
> RIP [< inline >] static_key_false include/linux/jump_label.h:133
> RIP [< inline >] kvm_apic_hw_enabled arch/x86/kvm/lapic.h:117
> RIP [< inline >] apic_enabled arch/x86/kvm/lapic.c:121
> RIP [<ffffffff810d138d>] apic_has_pending_timer+0x7d/0x210
> arch/x86/kvm/lapic.c:1731
> RSP <ffff880064dc7a60>
> ---[ end trace fe9c10b88e48c946 ]---
>
>
> All crashes suggest that apic is NULL.
>
> On commit b06f3a168cdcd80026276898fd1fee443ef25743 (Jan 6).
>
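
A note on why all three dumps point at a NULL apic: in each one RAX
holds dffffc0000000000 and RDX equals the checked address shifted
right by 3 (RDI = f0 gives RDX = 1e, R14 = 88 gives RDX = 11, RDI = b8
gives RDX = 17), which matches the usual KASAN shadow lookup. Below is
a small Python sketch of that arithmetic. It assumes the standard
x86_64 KASAN mapping shadow = (addr >> 3) + offset and 48-bit
canonical addresses; the helper names are made up for illustration and
are not kernel functions.

```python
# Assumption: this is the KASAN shadow offset; it is the RAX value in all dumps.
KASAN_SHADOW_OFFSET = 0xdffffc0000000000
MASK64 = (1 << 64) - 1


def shadow_addr(addr):
    """Shadow byte address KASAN would read before touching addr."""
    return ((addr >> 3) + KASAN_SHADOW_OFFSET) & MASK64


def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits-1) are all zeros or all ones."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1


# Faulting function -> (address being checked, RDX from the register dump)
dumps = {
    "kvm_arch_vcpu_ioctl": (0xF0, 0x1E),
    "wait_lapic_expire": (0x88, 0x11),
    "apic_has_pending_timer": (0xB8, 0x17),
}

for name, (addr, rdx) in dumps.items():
    sh = shadow_addr(addr)
    print(f"{name}: addr={addr:#x} rdx_matches={addr >> 3 == rdx} "
          f"shadow={sh:#x} canonical={is_canonical(sh)}")
```

Each computed shadow address is non-canonical, so the shadow-byte load
itself would raise the GPF, and the small checked addresses (0xf0,
0x88, 0xb8) look like field offsets from a NULL base pointer.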

Dmitry,

Please check your test harness and have it record which CPL the CPU
is running at when these GPFs occur, then add that to your reports.
I realize that a lot of kernel subsystems are coded very loosely when
it comes to checking for this. I have looked through some of the
hangs you reported, and I think one of them is related to a swapgs
instruction getting nested and two others are related to code
touching hardware.

Can you figure out how to include the privilege level you are at
when these faults occur? This one looks like swapgs got nested and
gs was pointing off into oblivion.
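
On the privilege-level question, the quoted dumps may already carry a
hint: the "RIP: 0010:" prefix and the "CS: 0010" field show the code
segment selector. Below is a minimal Python sketch of decoding such a
selector. It assumes the x86 selector layout (bits 0-1 RPL, bit 2
table indicator, bits 3-15 index) and that the RPL of CS equals the
CPL; decode_selector is a hypothetical helper, not something the test
harness provides.

```python
def decode_selector(sel):
    """Split an x86 segment selector into (index, table, rpl)."""
    rpl = sel & 0x3                          # bits 0-1: privilege level
    table = "LDT" if sel & 0x4 else "GDT"    # bit 2: table indicator
    index = sel >> 3                         # bits 3-15: descriptor index
    return index, table, rpl


# "0010" is the CS value printed in all three quoted dumps.
cs = int("0010", 16)
index, table, rpl = decode_selector(cs)
print(f"CS={cs:#06x}: index={index} table={table} CPL={rpl}")
```

For the dumps above this gives CPL 0, which would mean the faults were
taken in kernel mode. It does not by itself show the state of GS.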

:-)

Jeff