Re: [PATCH 10/54] KVM: x86/mmu: Replace EPT shadow page shenanigans with simpler check

From: Sean Christopherson
Date: Wed Jun 23 2021 - 12:17:20 EST


On Wed, Jun 23, 2021, Paolo Bonzini wrote:
> On 22/06/21 19:56, Sean Christopherson wrote:
> > Replace the hack to identify nested EPT shadow pages with a simple check
> > that the size of the guest PTEs associated with the shadow page and the
> > current MMU match, which is the intent of the "8 bytes == PAE" test.
> > The nested EPT hack existed to avoid a false negative due to the is_pae()
> > check not matching for 32-bit L2 guests; checking the MMU role directly
> > avoids the indirect calculation of the guest PTE size entirely.
>
> What the commit message doesn't say is, did we miss this opportunity all
> along, or has there been a change since commit 47c42e6b4192 ("KVM: x86:
> fix handling of role.cr4_pae and rename it to 'gpte_size'", 2019-03-28)
> that allows this?

The code was wrong from the initial "unsync" commit. The 4-byte vs. 8-byte check
papered over the real bug, which was that the roles were not checked for
compabitility. I suspect that the bug only manisfested as an observable problem
when the GPTE sizes mismatched, thus the PAE check was added.

So yes, there was an "opportunity" that was missed all along.

> I think the only change needed would be making the commit something like
> this:
>
> ==========
> KVM: x86/mmu: Use MMU role to check for matching guest page sizes
>
> Originally, __kvm_sync_page used to check the cr4_pae bit in the role
> to avoid zapping 4-byte kvm_mmu_pages when guest page size are 8-byte
> or the other way round. However, in commit 47c42e6b4192 ("KVM: x86: fix
> handling of role.cr4_pae and rename it to 'gpte_size'", 2019-03-28) it
> was observed that this did not work for nested EPT, where the page table
> size would be 8 bytes even if CR4.PAE=0. (Note that the check still
> has to be done for nested *NPT*, so it is not possible to use tdp_enabled
> or similar).
>
> Therefore, a hack was introduced to identify nested EPT shadow pages
> and unconditionally call __kvm_sync_page() on them. However, it is
> possible to do without the hack to identify nested EPT shadow pages:
> if EPT is active, there will be no shadow pages in non-EPT format,
> and all of them will have gpte_is_8_bytes set to true; we can just
> check the MMU role directly, and the test will always be true.
>
> Even for non-EPT shadow MMUs, this test should really always be true
> now that __kvm_sync_page() is called if and only if the role is an
> exact match (kvm_mmu_get_page()) or is part of the current MMU context
> (kvm_mmu_sync_roots()). A future commit will convert the likely-pointless
> check into a meaningful WARN to enforce that the mmu_roles of the current
> context and the shadow page are compatible.
> ==========
>
>
> Paolo
>
> > Note, this should be a glorified nop now that __kvm_sync_page() is called
> > if and only if the role is an exact match (kvm_mmu_get_page()) or is part
> > of the current MMU context (kvm_mmu_sync_roots()). A future commit will
> > convert the likely-pointless check into a meaningful WARN to enforce that
> > the mmu_roles of the current context and the shadow page are compatible.
> >
> > Cc: Vitaly Kuznetsov<vkuznets@xxxxxxxxxx>
> > Signed-off-by: Sean Christopherson<seanjc@xxxxxxxxxx>
>