Re: [PATCH 2/6] KVM: x86/mmu: Properly account NX huge page workaround for nonpaging MMUs
From: Mingwei Zhang
Date: Mon Apr 11 2022 - 18:05:25 EST
On Mon, Apr 11, 2022 at 11:33 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Mon, Apr 11, 2022, Mingwei Zhang wrote:
> > On Sat, Apr 09, 2022, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> > > index 671cfeccf04e..89df062d5921 100644
> > > --- a/arch/x86/kvm/mmu.h
> > > +++ b/arch/x86/kvm/mmu.h
> > > @@ -191,6 +191,15 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> > >  		.user = err & PFERR_USER_MASK,
> > >  		.prefetch = prefetch,
> > >  		.is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault),
> > > +
> > > +		/*
> > > +		 * Note, enforcing the NX huge page mitigation for nonpaging
> > > +		 * MMUs (shadow paging, CR0.PG=0 in the guest) is completely
> > > +		 * unnecessary.  The guest doesn't have any page tables to
> > > +		 * abuse and is guaranteed to switch to a different MMU when
> > > +		 * CR0.PG is toggled on (may not always be guaranteed when KVM
> > > +		 * is using TDP).  See make_spte() for details.
> > > +		 */
> > >  		.nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(),
> >
> > hmm. I think there could be a minor issue here (even in the original code).
> > nx_huge_page_workaround_enabled is attached to the page fault here.
> > However, at the time of make_spte(), we call is_nx_huge_page_enabled()
> > again. Since that function checks the module parameter directly,
> > there might be a race condition here, e.g., at the time of the page fault
> > the workaround was 'true', while by the time we reach make_spte() the
> > parameter has been set to 'false'.
>
> Toggling the mitigation invalidates and zaps all roots.  Any page fault that
> acquires mmu_lock after the toggling is guaranteed to see the correct value, and
> any page fault that completed before kvm_mmu_zap_all_fast() is guaranteed to be
> zapped.
hmm. ok.
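To convince myself, here is a toy user-space model of that argument. It is
only a sketch and not KVM code: every toy_* name is hypothetical, the
generation counter is my assumption standing in for "invalidates and zaps all
roots", and mmu_lock is not modeled at all (the model is single-threaded and
only replays the two orderings by hand).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; every toy_* name is hypothetical, not a KVM symbol. */

bool toy_nx_enabled = true;	/* stands in for the module parameter */
unsigned long toy_generation;	/* bumped whenever everything is zapped */

struct toy_fault {
	bool nx_workaround_enabled;	/* snapshot taken when the fault starts */
};

struct toy_spte {
	bool nx_applied;		/* global value seen when the entry was built */
	unsigned long generation;	/* generation the entry was installed under */
};

/* Fault entry: snapshot the global, like the kvm_page_fault initializer. */
struct toy_fault toy_fault_begin(void)
{
	struct toy_fault fault = { .nx_workaround_enabled = toy_nx_enabled };

	return fault;
}

/* Entry construction: re-reads the global instead of using the snapshot. */
struct toy_spte toy_make_spte(void)
{
	struct toy_spte spte = {
		.nx_applied = toy_nx_enabled,
		.generation = toy_generation,
	};

	return spte;
}

/* Toggle: flip the global and invalidate everything installed so far. */
void toy_toggle(bool enable)
{
	toy_nx_enabled = enable;
	toy_generation++;
}

/* An entry is usable only if no toggle happened after it was installed. */
bool toy_spte_is_live(const struct toy_spte *spte)
{
	return spte->generation == toy_generation;
}

/* Replay both orderings of fault vs. toggle. */
void toy_demo(void)
{
	/* Case 1: the fault completes, then the toggle lands. */
	struct toy_fault f1 = toy_fault_begin();
	struct toy_spte s1 = toy_make_spte();

	assert(f1.nx_workaround_enabled == s1.nx_applied);
	toy_toggle(false);
	assert(!toy_spte_is_live(&s1));	/* stale entry gets discarded */

	/* Case 2: the toggle lands between the snapshot and the build. */
	struct toy_fault f2 = toy_fault_begin();	/* snapshot: false */

	toy_toggle(true);

	struct toy_spte s2 = toy_make_spte();	/* sees the new value */

	assert(f2.nx_workaround_enabled != s2.nx_applied);
	assert(s2.nx_applied == toy_nx_enabled && toy_spte_is_live(&s2));
}
```

In case 2 the snapshot in the fault struct and the value used to build the
entry do disagree, which is the mismatch I was asking about, but the entry
itself is built with the current setting. In case 1 the entry was built with
the old setting and is thrown away by the toggle.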
>
> > I have not figured out what the side effect is. But I feel like
> > make_spte() should just follow the information in kvm_page_fault instead
> > of directly querying the global config.
>
> I started down this exact path :-) The problem is that, even without Ben's series,
> KVM uses make_spte() for things other than page faults.
Reviewed-by: Mingwei Zhang <mizhang@xxxxxxxxxx>