Re: [PATCH 08/12] KVM: MMU: do not consult levels when freeing roots

From: Sean Christopherson
Date: Thu Feb 10 2022 - 21:20:26 EST


On Fri, Feb 11, 2022, Sean Christopherson wrote:
> On Fri, Feb 11, 2022, Sean Christopherson wrote:
> > On Fri, Feb 11, 2022, Paolo Bonzini wrote:
> > > On 2/11/22 01:54, Sean Christopherson wrote:
> > > > > > @@ -3242,8 +3245,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> > > > > > &invalid_list);
> > > > > > if (free_active_root) {
> > > > > > - if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
> > > > > > - (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
> > > > > > + if (to_shadow_page(mmu->root.hpa)) {
> > > > > > mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
> > > > > > } else if (mmu->pae_root) {
> > > >
> > > > Gah, this is technically wrong. It shouldn't truly matter, but it's wrong. root.hpa
> > > > will not be backed by shadow page if the root is pml4_root or pml5_root, in which
> > > > case freeing the PAE root is wrong. They should obviously be invalid already, but
> > > > it's a little confusing because KVM wanders down a path that may not be relevant
> > > > to the current mode.
> > >
> > > pml4_root and pml5_root are dummy, and the first "real" level of page tables
> > > is stored in pae_root for that case too, so I think that should DTRT.
> >
> > Ugh, completely forgot that detail. You're correct.

Mostly correct. The first "real" level will be PML4 in the hCR4.LA57=1, gCR4.LA57=0
nested NPT case. Ditto for shadowing PAE NPT with 4/5-level NPT, though in that
case KVM still allocates pae_root entries; it just happens to be a "real" level.

And now I realize why I'm so confused: mmu_alloc_shadow_roots() is also broken
with respect to 5-level shadowing 4-level. I believe the part that got fixed
was 5-level with a 32-bit guest. Ugh.

For the stuff that actually works in KVM, this will do just fine. 5-level nNPT
can be punted to the future.
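
For anyone skimming the thread, here is a rough userspace sketch of the decision
the quoted hunk makes: pick the free path by asking whether the root is backed by
a shadow page, rather than by comparing root levels. This is a toy model, not
kernel code. Every name in it (toy_mmu, toy_shadow_page, toy_pick_free_path, the
enum values) is hypothetical, and it assumes the only inputs that matter are
"root has a shadow page" and "pae_root is allocated".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the real KVM structures or helpers. */
struct toy_shadow_page {
	int level;
};

struct toy_mmu {
	/* Non-NULL only when the root HPA is backed by a shadow page. */
	struct toy_shadow_page *root_sp;
	/* True when a pae_root array has been allocated. */
	bool has_pae_root;
};

enum toy_free_path {
	TOY_FREE_ROOT_SP,	/* free the single shadow-page root */
	TOY_FREE_PAE_ROOTS,	/* walk and free the pae_root entries */
	TOY_FREE_NOTHING,
};

/*
 * Mirrors the shape of the new check: ask whether the root is backed by a
 * shadow page instead of comparing root levels.
 */
static enum toy_free_path toy_pick_free_path(const struct toy_mmu *mmu)
{
	assert(mmu != NULL);

	if (mmu->root_sp)
		return TOY_FREE_ROOT_SP;
	if (mmu->has_pae_root)
		return TOY_FREE_PAE_ROOTS;
	return TOY_FREE_NOTHING;
}
```

In this model a dummy pml4_root/pml5_root shows up as root_sp == NULL with
has_pae_root set, so it takes the pae_root path, which is the behavior discussed
above.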