Re: [PATCH 06/43] KVM: x86: Properly reset MMU context at vCPU RESET/INIT
From: Reiji Watanabe
Date: Mon May 24 2021 - 00:58:30 EST
On Wed, May 19, 2021 at 10:16 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Tue, May 18, 2021, Reiji Watanabe wrote:
> > > > > + if (kvm_cr0_mmu_role_changed(old_cr0, kvm_read_cr0(vcpu)) ||
> > > > > + kvm_cr4_mmu_role_changed(old_cr4, kvm_read_cr4(vcpu)))
> > > > > + kvm_mmu_reset_context(vcpu);
> > > > > }
> > > >
> > > > I'm wondering if kvm_vcpu_reset() should call kvm_mmu_reset_context()
> > > > for a change in EFER.NX as well.
> > >
> > > Oooh. So there _should_ be no need. Paging has to be enabled for EFER.NX to
> > > be relevant, and INIT toggles CR0.PG 1=>0 if paging was enabled and so is
> > > guaranteed to trigger a context reset. And we do want to skip the context reset,
> > > e.g. INIT-SIPI-SIPI when the vCPU has paging disabled should continue using the
> > > same MMU.
> > >
> > > But, kvm_calc_mmu_role_common() neglects to ignore NX if CR0.PG=0, and so the
> > > MMU role will be stale if INIT clears EFER.NX without forcing a context reset.
> > > However, that's benign from a functionality perspective because the context
> > > itself correctly incorporates CR0.PG, it's only the role that's borked. I.e.
> > > KVM will fail to reuse a page/context due to the spurious role.nxe, but the
> > > permission checks are always correct.
> > >
> > > I'll add a comment here and send a patch to fix the role calculation.
> >
> > Thank you so much for the explanation !
> > I understand your intention and why it would be benign.
> >
> > Then, I'm wondering if kvm_cr4_mmu_role_changed() needs to be
> > called here. Looking at the Intel SDM, in my understanding,
> > all the bits kvm_cr4_mmu_role_changed() checks are relevant
> > only if paging is enabled. (Or is my understanding incorrect ??)
>
> Duh, yes. And it goes even beyond that, CR0.WP is only relevant if CR0.PG=1,
> i.e. INIT with CR0.PG=0 and CR0.WP=1 will incorrectly trigger an MMU reset with
> the current logic.
>
> Sadly, simply omitting the CR4 check puts us in an awkward situation where, due
> to the MMU role CR4 calculations not accounting for CR0.PG=0, KVM will run with
> a stale role.
>
> The other consideration is that kvm_post_set_cr4() and kvm_post_set_cr0() should
> also skip kvm_mmu_reset_context() if CR0.PG=0, but again that requires fixing
> the role calculations first (or at the same time).
>
> I think I'll throw in those cleanups to the beginning of this series. The result
> is going to be disgustingly long, but I really don't want to introduce code that
> knowingly leaves KVM in an inconsistent state, nor do I want to add useless
> checks on CR4 and EFER.
Yes, I would think having the cleanups would be better.
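
For illustration only, here is a minimal standalone sketch of the idea
discussed above: if the role calculation ignores CR0.WP, the CR4 paging bits,
and EFER.NX whenever CR0.PG=0, a reset check can compare the effective role
inputs instead of the raw register values. This is not the KVM code.
struct sketch_regs, sketch_calc_role() and sketch_needs_mmu_reset() are
hypothetical names, and the set of CR4 bits is an assumption chosen for the
example; only the bit positions are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions. */
#define X86_CR0_WP   (1ULL << 16)
#define X86_CR0_PG   (1ULL << 31)
#define X86_CR4_PSE  (1ULL << 4)
#define X86_CR4_PAE  (1ULL << 5)
#define X86_CR4_LA57 (1ULL << 12)
#define X86_CR4_SMEP (1ULL << 20)
#define X86_CR4_SMAP (1ULL << 21)
#define X86_CR4_PKE  (1ULL << 22)
#define EFER_NX      (1ULL << 11)

/* Assumption: an illustrative set of CR4 bits that feed the MMU role. */
#define SKETCH_CR4_ROLE_BITS (X86_CR4_PSE | X86_CR4_PAE | X86_CR4_LA57 | \
                              X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE)

/* Hypothetical container for the registers that matter here. */
struct sketch_regs {
	uint64_t cr0;
	uint64_t cr4;
	uint64_t efer;
};

/* Hypothetical "effective role": paging-only bits are zeroed if CR0.PG=0. */
struct sketch_role {
	bool pg;
	uint64_t cr0_bits;
	uint64_t cr4_bits;
	uint64_t efer_bits;
};

static struct sketch_role sketch_calc_role(const struct sketch_regs *r)
{
	struct sketch_role role = { 0 };

	role.pg = (r->cr0 & X86_CR0_PG) != 0;
	if (role.pg) {
		/* These bits only affect translation when paging is on. */
		role.cr0_bits = r->cr0 & X86_CR0_WP;
		role.cr4_bits = r->cr4 & SKETCH_CR4_ROLE_BITS;
		role.efer_bits = r->efer & EFER_NX;
	}
	return role;
}

/* A reset is needed only if the effective role differs. */
static bool sketch_needs_mmu_reset(const struct sketch_regs *old_regs,
				   const struct sketch_regs *new_regs)
{
	struct sketch_role a = sketch_calc_role(old_regs);
	struct sketch_role b = sketch_calc_role(new_regs);

	return a.pg != b.pg || a.cr0_bits != b.cr0_bits ||
	       a.cr4_bits != b.cr4_bits || a.efer_bits != b.efer_bits;
}
```

With this shape, an INIT that clears CR0.WP, CR4, and EFER.NX while paging
was already disabled compares equal and keeps the same MMU, while an INIT
that toggles CR0.PG 1=>0 still forces a reset.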
Thank you !
Reiji