[PATCH 6/6] KVM: x86/mmu: explicitly check nx_hugepage in disallowed_hugepage_adjust()

From: Sean Christopherson
Date: Fri Apr 08 2022 - 20:39:33 EST


From: Mingwei Zhang <mizhang@xxxxxxxxxx>

Explicitly check if an NX huge page is disallowed when determining
whether a page fault needs to be forced to use a smaller sized page.
KVM incorrectly assumes that the NX huge page mitigation is the only
scenario in which KVM will create a shadow page instead of a huge page.
Any scenario that causes KVM to zap leaf SPTEs may result in a shadow
page that can be made huge without violating the NX huge page
mitigation, e.g. disabling dirty logging, zapping from the mmu_notifier
due to page migration, guest MTRR changes that affect the viability of
a huge page, etc.

Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Signed-off-by: Mingwei Zhang <mizhang@xxxxxxxxxx>
[sean: add barrier comments, use spte_to_sp()]
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
arch/x86/kvm/mmu/mmu.c | 17 +++++++++++++++--
arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++++
2 files changed, 21 insertions(+), 2 deletions(-)
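
Note: the smp_wmb()/smp_rmb() pair added below follows the usual
publish/consume pattern: the writer makes nx_huge_page_disallowed
visible before the SP is marked present, and the reader only looks at
the flag after observing a present SPTE. Purely as an illustration of
that ordering (not KVM code), here is a minimal user-space sketch using
C11 fences in place of the kernel barriers; the fake_sp/publish_sp/
consume_sp names are made up for this example.

/* Stand-alone sketch of the release/acquire pairing described above. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_sp {
	bool nx_huge_page_disallowed;	/* plain data, written before publication */
	_Atomic bool present;		/* stands in for the present SPTE */
};

/* Writer side, analogous to kvm_tdp_mmu_map(): set the flag, then publish. */
static void publish_sp(struct fake_sp *sp, bool disallowed)
{
	sp->nx_huge_page_disallowed = disallowed;
	/* Order the flag write before marking the SP present (smp_wmb() role). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&sp->present, true, memory_order_relaxed);
}

/* Reader side, analogous to disallowed_hugepage_adjust(). */
static bool consume_sp(struct fake_sp *sp)
{
	if (!atomic_load_explicit(&sp->present, memory_order_relaxed))
		return false;
	/* Order the present check before reading the flag (smp_rmb() role). */
	atomic_thread_fence(memory_order_acquire);
	return sp->nx_huge_page_disallowed;
}

int main(void)
{
	struct fake_sp sp = { 0 };

	publish_sp(&sp, true);
	printf("nx_huge_page_disallowed: %d\n", consume_sp(&sp));
	return 0;
}

A reader that observes "present" is thus guaranteed to also observe the
flag value written before publication, which is exactly what the new
check in disallowed_hugepage_adjust() relies on under a read-held
mmu_lock.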

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8b4f3550710a..c6f018c6d2f5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2908,6 +2908,19 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
cur_level == fault->goal_level &&
is_shadow_present_pte(spte) &&
!is_large_pte(spte)) {
+ u64 page_mask;
+
+ /*
+ * Ensure nx_huge_page_disallowed is read after checking for a
+ * present shadow page. A different vCPU may be concurrently
+ * installing the shadow page if mmu_lock is held for read.
+ * Pairs with the smp_wmb() in kvm_tdp_mmu_map().
+ */
+ smp_rmb();
+
+ if (!spte_to_sp(spte)->nx_huge_page_disallowed)
+ return;
+
/*
* A small SPTE exists for this pfn, but FNAME(fetch)
* and __direct_map would like to create a large PTE
@@ -2915,8 +2928,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
* patching back for them into pfn the next 9 bits of
* the address.
*/
- u64 page_mask = KVM_PAGES_PER_HPAGE(cur_level) -
- KVM_PAGES_PER_HPAGE(cur_level - 1);
+ page_mask = KVM_PAGES_PER_HPAGE(cur_level) -
+ KVM_PAGES_PER_HPAGE(cur_level - 1);
fault->pfn |= fault->gfn & page_mask;
fault->goal_level--;
}
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index d0e6b341652c..5cae5cdcfcbc 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1185,6 +1185,12 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
tdp_mmu_init_child_sp(sp, &iter);

sp->nx_huge_page_disallowed = fault->huge_page_disallowed;
+ /*
+ * Ensure nx_huge_page_disallowed is visible before the
+ * SP is marked present, as mmu_lock is held for read.
+ * Pairs with the smp_rmb() in disallowed_hugepage_adjust().
+ */
+ smp_wmb();

if (tdp_mmu_link_sp(kvm, &iter, sp, true)) {
tdp_mmu_free_sp(sp);
--
2.35.1.1178.g4f1659d476-goog