Re: [PATCH v2 07/17] KVM: x86/mmu: Check PDPTRs before allocating PAE roots
From: Sean Christopherson
Date: Thu Apr 08 2021 - 11:48:39 EST
On Thu, Apr 08, 2021, Paolo Bonzini wrote:
> On 08/04/21 13:15, Wanpeng Li wrote:
> > I saw this splatting:
> >
> > BUG: sleeping function called from invalid context at
> > arch/x86/kvm/kvm_cache_regs.h:115
> > kvm_pdptr_read+0x20/0x60 [kvm]
> > kvm_mmu_load+0x3bd/0x540 [kvm]
> >
> > There is a might_sleep() in kvm_pdptr_read(), however, the original
> > commit didn't explain more. I can send a formal one if the below fix
> > is acceptable.
We don't want to drop mmu_lock, even temporarily. The reason for holding it
across the entire sequence is to ensure kvm_mmu_available_pages() isn't violated.
> I think we can just push make_mmu_pages_available down into
> kvm_mmu_load's callees. This way it's not necessary to hold the lock
> until after the PDPTR check:
...
> @@ -4852,14 +4868,10 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
> r = mmu_alloc_special_roots(vcpu);
> if (r)
> goto out;
> - write_lock(&vcpu->kvm->mmu_lock);
> - if (make_mmu_pages_available(vcpu))
> - r = -ENOSPC;
> - else if (vcpu->arch.mmu->direct_map)
> + if (vcpu->arch.mmu->direct_map)
> r = mmu_alloc_direct_roots(vcpu);
> else
> r = mmu_alloc_shadow_roots(vcpu);
> - write_unlock(&vcpu->kvm->mmu_lock);
> if (r)
> goto out;
Freaking PDPTRs. I was really hoping we could keep the lock and pages_available()
logic outside of the helpers. What if kvm_mmu_load() reads the PDPTRs and
passes them into mmu_alloc_shadow_roots()? Or is that too ugly?
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index efb41f31e80a..e3c4938cd665 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3275,11 +3275,11 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
return 0;
}
-static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
+static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu, u64 pdptrs[4])
{
struct kvm_mmu *mmu = vcpu->arch.mmu;
- u64 pdptrs[4], pm_mask;
gfn_t root_gfn, root_pgd;
+ u64 pm_mask;
hpa_t root;
int i;
@@ -3291,11 +3291,8 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
if (mmu->root_level == PT32E_ROOT_LEVEL) {
for (i = 0; i < 4; ++i) {
- pdptrs[i] = mmu->get_pdptr(vcpu, i);
- if (!(pdptrs[i] & PT_PRESENT_MASK))
- continue;
-
- if (mmu_check_root(vcpu, pdptrs[i] >> PAGE_SHIFT))
+ if ((pdptrs[i] & PT_PRESENT_MASK) &&
+ mmu_check_root(vcpu, pdptrs[i] >> PAGE_SHIFT))
return 1;
}
}
@@ -4844,21 +4841,33 @@ EXPORT_SYMBOL_GPL(kvm_mmu_reset_context);
int kvm_mmu_load(struct kvm_vcpu *vcpu)
{
- int r;
+ struct kvm_mmu *mmu = vcpu->arch.mmu;
+ u64 pdptrs[4];
+ int r, i;
- r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->direct_map);
+ r = mmu_topup_memory_caches(vcpu, !mmu->direct_map);
if (r)
goto out;
r = mmu_alloc_special_roots(vcpu);
if (r)
goto out;
+
+ /*
+ * On SVM, reading PDPTRs might access guest memory, which might fault
+ * and thus might sleep. Grab the PDPTRs before acquiring mmu_lock.
+ */
+ if (!mmu->direct_map && mmu->root_level == PT32E_ROOT_LEVEL) {
+ for (i = 0; i < 4; ++i)
+ pdptrs[i] = mmu->get_pdptr(vcpu, i);
+ }
+
write_lock(&vcpu->kvm->mmu_lock);
if (make_mmu_pages_available(vcpu))
r = -ENOSPC;
else if (vcpu->arch.mmu->direct_map)
r = mmu_alloc_direct_roots(vcpu);
else
- r = mmu_alloc_shadow_roots(vcpu);
+ r = mmu_alloc_shadow_roots(vcpu, pdptrs);
write_unlock(&vcpu->kvm->mmu_lock);
if (r)
goto out;