Re: [syzbot] [kvm?] general protection fault in is_page_fault_stale

From: Sean Christopherson
Date: Mon Jul 22 2024 - 12:08:38 EST


On Mon, Jul 22, 2024, syzbot wrote:
> Oops: general protection fault, probably for non-canonical address 0xe000013ffffffffd: 0000 [#1] PREEMPT SMP KASAN PTI
> KASAN: maybe wild-memory-access in range [0x000029ffffffffe8-0x000029ffffffffef]
> CPU: 0 PID: 11829 Comm: syz.1.1799 Not tainted 6.10.0-syzkaller-11185-g2c9b3512402e #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
> RIP: 0010:to_shadow_page arch/x86/kvm/mmu/spte.h:245 [inline]
> RIP: 0010:spte_to_child_sp arch/x86/kvm/mmu/spte.h:250 [inline]
> RIP: 0010:root_to_sp arch/x86/kvm/mmu/spte.h:267 [inline]
> RIP: 0010:is_page_fault_stale+0xc4/0x530 arch/x86/kvm/mmu/mmu.c:4517
> Code: e9 00 01 00 00 48 b8 ff ff ff ff ff 00 00 00 48 21 c3 48 c1 e3 06 49 bc 28 00 00 00 00 ea ff ff 49 01 dc 4c 89 e0 48 c1 e8 03 <42> 80 3c 28 00 74 08 4c 89 e7 e8 6d b7 d8 00 4d 8b 2c 24 31 ff 4c
> RSP: 0018:ffffc9000fc6f6f0 EFLAGS: 00010202
> RAX: 0000053ffffffffd RBX: 00003fffffffffc0 RCX: ffff88806a8bda00
> RDX: 0000000000000000 RSI: 000fffffffffffff RDI: 00000000000129d3
> RBP: 00000000000129d3 R08: ffffffff8120c8e0 R09: 1ffff920005e6c00
> R10: dffffc0000000000 R11: fffff520005e6c01 R12: 000029ffffffffe8
> R13: dffffc0000000000 R14: ffffc9000fc6f800 R15: ffff88807cbed000
> FS: 0000000000000000(0000) GS:ffff8880b9400000(0063) knlGS:00000000f5d46b40
> CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: 00000000576d24c0 CR3: 000000007d930000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> kvm_tdp_mmu_page_fault arch/x86/kvm/mmu/mmu.c:4662 [inline]
> kvm_tdp_page_fault+0x25c/0x320 arch/x86/kvm/mmu/mmu.c:4693
> kvm_mmu_do_page_fault+0x589/0xca0 arch/x86/kvm/mmu/mmu_internal.h:323
> kvm_tdp_map_page arch/x86/kvm/mmu/mmu.c:4715 [inline]
> kvm_arch_vcpu_pre_fault_memory+0x2db/0x5a0 arch/x86/kvm/mmu/mmu.c:4760
> kvm_vcpu_pre_fault_memory+0x24c/0x4b0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4418
> kvm_vcpu_ioctl+0xa47/0xea0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4648
> kvm_vcpu_compat_ioctl+0x242/0x450 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4700
> __do_compat_sys_ioctl fs/ioctl.c:1007 [inline]
> __se_compat_sys_ioctl+0x51c/0xca0 fs/ioctl.c:950
> do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
> __do_fast_syscall_32+0xb4/0x110 arch/x86/entry/common.c:386
> do_fast_syscall_32+0x34/0x80 arch/x86/entry/common.c:411
> entry_SYSENTER_compat_after_hwframe+0x84/0x8e

The amount of sanitizer code in play makes it difficult to read the assembly, but
unless I'm misreading things the explosion happens on

	return (struct kvm_mmu_page *)page_private(page);

which suggests that vcpu->arch.mmu->root.hpa is garbage. Lo and behold! Just
before the explosion, the log shows a failslab fault injection forcing an
allocation failure during kvm_mmu_load().
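
For what it's worth, the register dump can be checked against that reading.
The sketch below redoes the address arithmetic visible in the Code: bytes. It
assumes the root was shifted right by PAGE_SHIFT (12) before the mask (those
instructions are cut off in the dump), that 0xffffea0000000028 is the vmemmap
base plus the offset of page->private, and that sizeof(struct page) is 64. The
function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Constants read off the Code: bytes and the register dump in the oops. */
#define PFN_MASK          0xffffffffffULL        /* movabs $0xffffffffff,%rax */
#define PAGE_PRIVATE_BASE 0xffffea0000000028ULL  /* movabs $...,%r12 */
#define KASAN_SHADOW_OFF  0xdffffc0000000000ULL  /* R13 */

/* (root >> 12) & mask, then << 6: what ends up in RBX (assumed shift by 12). */
uint64_t struct_page_offset(uint64_t root_hpa)
{
	return ((root_hpa >> 12) & PFN_MASK) << 6;
}

/* Address of page->private for the root's struct page: what ends up in R12. */
uint64_t page_private_addr(uint64_t root_hpa)
{
	/* Unsigned addition, so this wraps modulo 2^64 like the add in the oops. */
	return PAGE_PRIVATE_BASE + struct_page_offset(root_hpa);
}

/* KASAN shadow byte checked before the load: (addr >> 3) + shadow offset. */
uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> 3) + KASAN_SHADOW_OFF;
}

/* Compare the arithmetic against the values reported in the oops. */
void check_against_oops(void)
{
	uint64_t root = ~(uint64_t)0;  /* all-ones root, an assumption */
	uint64_t addr = page_private_addr(root);

	assert(struct_page_offset(root) == 0x00003fffffffffc0ULL);      /* RBX */
	assert(addr == 0x000029ffffffffe8ULL);                          /* R12 */
	assert((addr >> 3) == 0x0000053ffffffffdULL);                   /* RAX */
	assert(kasan_shadow_addr(addr) == 0xe000013ffffffffdULL);       /* fault */
}
```

An all-ones root reproduces RBX, R12, RAX and the faulting address, which is
consistent with root.hpa having been left at an invalid all-ones value such as
INVALID_PAGE. Any root with bits 51:12 all set would give the same numbers, so
this narrows things down rather than proving it.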

Not sure why syzbot can't get a repro, but I'm pretty confident the bug is that
kvm_arch_vcpu_pre_fault_memory() doesn't check the result of kvm_mmu_reload().

I'll send this after a bit of testing:

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 901be9e420a4..ee516baf3a31 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4747,7 +4747,9 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 	 * reload is efficient when called repeatedly, so we can do it on
 	 * every iteration.
 	 */
-	kvm_mmu_reload(vcpu);
+	r = kvm_mmu_reload(vcpu);
+	if (r)
+		return r;

 	if (kvm_arch_has_private_mem(vcpu->kvm) &&
 	    kvm_mem_is_private(vcpu->kvm, gpa_to_gfn(range->gpa)))
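
To illustrate why the missing check matters, here is a rough userspace model
of the pattern, not KVM code. Everything prefixed toy_ is hypothetical:
toy_mmu_reload() stands in for kvm_mmu_reload(), fail_alloc stands in for the
failslab injection, and -EFAULT from toy_map_page() stands in for the GP fault
on an invalid root.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INVALID_ROOT (~(uint64_t)0)  /* stand-in for an unloaded root */

struct toy_vcpu {
	uint64_t root_hpa;
	bool fail_alloc;  /* models fault injection into the allocation */
};

/* Load a root if none is present; fails if the allocation is forced to fail. */
static int toy_mmu_reload(struct toy_vcpu *v)
{
	if (v->root_hpa != TOY_INVALID_ROOT)
		return 0;
	if (v->fail_alloc)
		return -ENOMEM;
	v->root_hpa = 0x1000;  /* arbitrary valid-looking root */
	return 0;
}

/* Using the root; -EFAULT models the crash on an invalid root. */
static int toy_map_page(const struct toy_vcpu *v)
{
	return v->root_hpa == TOY_INVALID_ROOT ? -EFAULT : 0;
}

/* Buggy pattern: the reload result is dropped on the floor. */
int toy_pre_fault_unchecked(struct toy_vcpu *v)
{
	toy_mmu_reload(v);
	return toy_map_page(v);
}

/* Fixed pattern: bail out before touching the root if the reload failed. */
int toy_pre_fault_checked(struct toy_vcpu *v)
{
	int r = toy_mmu_reload(v);

	if (r)
		return r;
	return toy_map_page(v);
}

/* Walk through the failing and succeeding cases. */
void toy_demo(void)
{
	struct toy_vcpu v = { .root_hpa = TOY_INVALID_ROOT, .fail_alloc = true };

	assert(toy_pre_fault_unchecked(&v) == -EFAULT);  /* walks a bad root */
	assert(toy_pre_fault_checked(&v) == -ENOMEM);    /* error propagated */

	v.fail_alloc = false;
	assert(toy_pre_fault_checked(&v) == 0);          /* root gets loaded */
}
```

With the allocation forced to fail, the unchecked variant goes on to use the
invalid root, while the checked variant hands the error back to the caller.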


[ 363.075965][T11829] FAULT_INJECTION: forcing a failure.
[ 363.075965][T11829] name failslab, interval 1, probability 0, space 0, times 0
[ 363.089953][ T53] vhci_hcd: release socket
[ 363.094422][ T53] vhci_hcd: disconnect device
[ 363.117979][T11829] CPU: 0 PID: 11829 Comm: syz.1.1799 Not tainted 6.10.0-syzkaller-11185-g2c9b3512402e #0
[ 363.127841][T11829] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
[ 363.137917][T11829] Call Trace:
[ 363.141207][T11829] <TASK>
[ 363.144137][T11829] dump_stack_lvl+0x241/0x360
[ 363.148907][T11829] ? __pfx_dump_stack_lvl+0x10/0x10
[ 363.154103][T11829] ? __pfx__printk+0x10/0x10
[ 363.158689][T11829] ? validate_chain+0x11e/0x5900
[ 363.163722][T11829] should_fail_ex+0x3b0/0x4e0
[ 363.168410][T11829] should_failslab+0x9/0x20
[ 363.172915][T11829] __kmalloc_node_noprof+0xdf/0x440
[ 363.178111][T11829] ? __kvmalloc_node_noprof+0x72/0x190
[ 363.183574][T11829] __kvmalloc_node_noprof+0x72/0x190
[ 363.188869][T11829] __kvm_mmu_topup_memory_cache+0x4d9/0x6b0
[ 363.194801][T11829] kvm_mmu_load+0x115/0x26e0
[ 363.199416][T11829] ? __asan_memset+0x23/0x50
[ 363.204038][T11829] ? vmx_vcpu_pi_load+0x13b/0x8c0
[ 363.209381][T11829] ? __pfx_kvm_mmu_load+0x10/0x10
[ 363.214447][T11829] ? __lock_acquire+0x137a/0x2040
[ 363.219511][T11829] kvm_arch_vcpu_pre_fault_memory+0x4c0/0x5a0
[ 363.225862][T11829] ? __pfx_kvm_arch_vcpu_load+0x10/0x10
[ 363.231674][T11829] ? __pfx_kvm_arch_vcpu_pre_fault_memory+0x10/0x10
[ 363.238267][T11829] ? __pfx_lock_release+0x10/0x10
[ 363.243305][T11829] kvm_vcpu_pre_fault_memory+0x24c/0x4b0
[ 363.248941][T11829] ? kvm_vcpu_pre_fault_memory+0x16a/0x4b0
[ 363.254748][T11829] kvm_vcpu_ioctl+0xa47/0xea0
[ 363.259422][T11829] ? __lock_acquire+0x137a/0x2040
[ 363.264461][T11829] ? __pfx_kvm_vcpu_ioctl+0x10/0x10
[ 363.269663][T11829] ? __pfx_tomoyo_path_number_perm+0x10/0x10
[ 363.275672][T11829] kvm_vcpu_compat_ioctl+0x242/0x450
[ 363.280962][T11829] ? __pfx_kvm_vcpu_compat_ioctl+0x10/0x10
[ 363.286780][T11829] ? __fget_files+0x3f6/0x470
[ 363.291474][T11829] ? bpf_lsm_file_ioctl_compat+0x9/0x10
[ 363.297023][T11829] ? security_file_ioctl_compat+0x87/0xb0
[ 363.302765][T11829] __se_compat_sys_ioctl+0x51c/0xca0
[ 363.308053][T11829] ? __pfx___se_compat_sys_ioctl+0x10/0x10
[ 363.313859][T11829] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 363.319981][T11829] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 363.326322][T11829] ? syscall_enter_from_user_mode_prepare+0x7f/0xe0
[ 363.332919][T11829] ? lockdep_hardirqs_on+0x99/0x150
[ 363.338116][T11829] __do_fast_syscall_32+0xb4/0x110
[ 363.343221][T11829] ? exc_page_fault+0x590/0x8c0
[ 363.348095][T11829] do_fast_syscall_32+0x34/0x80
[ 363.352937][T11829] entry_SYSENTER_compat_after_hwframe+0x84/0x8e