Re: [PATCH 1/5] KVM: x86/mmu: Don't attempt to map leaf if target TDP MMU SPTE is frozen

From: Robert Hoo
Date: Wed Dec 14 2022 - 06:58:36 EST


On Tue, 2022-12-13 at 03:30 +0000, Sean Christopherson wrote:
> Hoist the is_removed_spte() check above the "level == goal_level" check
> when walking SPTEs during a TDP MMU page fault to avoid attempting to map
> a leaf entry if said entry is frozen by a different task/vCPU.
>
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 939 at arch/x86/kvm/mmu/tdp_mmu.c:653 kvm_tdp_mmu_map+0x269/0x4b0
> Modules linked in: kvm_intel
> CPU: 3 PID: 939 Comm: nx_huge_pages_t Not tainted 6.1.0-rc4+ #67
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> RIP: 0010:kvm_tdp_mmu_map+0x269/0x4b0
> RSP: 0018:ffffc9000068fba8 EFLAGS: 00010246
> RAX: 00000000000005a0 RBX: ffffc9000068fcc0 RCX: 0000000000000005
> RDX: ffff88810741f000 RSI: ffff888107f04600 RDI: ffffc900006a3000
> RBP: 060000010b000bf3 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 000ffffffffff000 R12: 0000000000000005
> R13: ffff888113670000 R14: ffff888107464958 R15: 0000000000000000
> FS: 00007f01c942c740(0000) GS:ffff888277cc0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 0000000117013006 CR4: 0000000000172ea0
> Call Trace:
> <TASK>
> kvm_tdp_page_fault+0x10c/0x130
> kvm_mmu_page_fault+0x103/0x680
> vmx_handle_exit+0x132/0x5a0 [kvm_intel]
> vcpu_enter_guest+0x60c/0x16f0
> kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
> kvm_vcpu_ioctl+0x271/0x660
> __x64_sys_ioctl+0x80/0xb0
> do_syscall_64+0x2b/0x50
> entry_SYSCALL_64_after_hwframe+0x46/0xb0
> </TASK>
> ---[ end trace 0000000000000000 ]---
>
> Fixes: 63d28a25e04c ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry")
> Cc: Robert Hoo <robert.hu@xxxxxxxxxxxxxxx>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/mmu/tdp_mmu.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 764f7c87286f..b740f38fedcc 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1162,9 +1162,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> if (fault->nx_huge_page_workaround_enabled)
> disallowed_hugepage_adjust(fault, iter.old_spte, iter.level);
>
> - if (iter.level == fault->goal_level)
> - break;
> -
> /*
> * If SPTE has been frozen by another thread, just give up and
> * retry, avoiding unnecessary page table allocation and free.
> @@ -1172,6 +1169,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
> if (is_removed_spte(iter.old_spte))
> goto retry;
>
> + if (iter.level == fault->goal_level)
> + break;
> +
> /* Step down into the lower level page table if it exists. */
> if (is_shadow_present_pte(iter.old_spte) &&
> !is_large_pte(iter.old_spte))
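
For anyone following along, the effect of the reordering can be modeled
outside the kernel. The sketch below is only a toy: toy_walk(), the
TOY_* constants and enum walk_result are hypothetical names made up for
illustration, not kernel definitions, and the walk is reduced to an
array lookup per level. It only shows that with the frozen check first,
a frozen entry at the goal level leads to a retry rather than an attempt
to map the leaf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define TOY_REMOVED_SPTE 0x5a0ULL /* marker for an entry frozen by another task */
#define TOY_PRESENT      0x1ULL   /* entry points at a lower-level table */

enum walk_result { WALK_MAP_LEAF, WALK_RETRY, WALK_NEED_TABLE };

static bool toy_is_removed_spte(uint64_t spte)
{
	return spte == TOY_REMOVED_SPTE;
}

/*
 * Walk from top_level down to goal_level. sptes[i] is the entry seen at
 * level (top_level - i). frozen_check_first selects the ordering from
 * this patch (true) or the previous ordering (false).
 */
static enum walk_result toy_walk(const uint64_t *sptes, int top_level,
				 int goal_level, bool frozen_check_first)
{
	assert(goal_level >= 1 && top_level >= goal_level);

	for (int level = top_level; level >= goal_level; level--) {
		uint64_t old_spte = sptes[top_level - level];

		/* Patched order: bail on a frozen entry at every level. */
		if (frozen_check_first && toy_is_removed_spte(old_spte))
			return WALK_RETRY;

		if (level == goal_level)
			return WALK_MAP_LEAF;

		/* Old order: the frozen check is skipped at the goal level. */
		if (!frozen_check_first && toy_is_removed_spte(old_spte))
			return WALK_RETRY;

		/* A non-present entry would need a new page table here. */
		if (!(old_spte & TOY_PRESENT))
			return WALK_NEED_TABLE;
	}
	return WALK_MAP_LEAF;
}
```

Calling toy_walk() on a three-level array whose last entry is
TOY_REMOVED_SPTE returns WALK_RETRY with frozen_check_first set and
WALK_MAP_LEAF without it. The two orderings agree when the frozen entry
sits above the goal level or when nothing is frozen.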

Reviewed-by: Robert Hoo <robert.hu@xxxxxxxxxxxxxxx>