Re: [PATCH 21/24] KVM: x86: always update CR3 in VMCB

From: Sean Christopherson
Date: Wed May 20 2020 - 14:22:06 EST


On Wed, May 20, 2020 at 01:21:42PM -0400, Paolo Bonzini wrote:
> vmx_load_mmu_pgd is delaying the write of GUEST_CR3 to prepare_vmcs02 as
> an optimization, but this is only correct before the nested vmentry.
> If userspace is modifying CR3 with KVM_SET_SREGS after the VM has
> already been put in guest mode, the value of CR3 will not be updated.
> Remove the optimization, which almost never triggers anyway.
>
> This also applies to SVM, where the code was added in commit 689f3bf21628
> ("KVM: x86: unify callbacks to load paging root", 2020-03-16) just to keep the
> two vendor-specific modules closer.
>
> Fixes: 04f11ef45810 ("KVM: nVMX: Always write vmcs02.GUEST_CR3 during nested VM-Enter")
> Fixes: 689f3bf21628 ("KVM: x86: unify callbacks to load paging root")
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---

...

> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 55712dd86baf..7daf6a50e774 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -3085,10 +3085,7 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
>  			spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
>  		}
>
> -		/* Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter. */
> -		if (is_guest_mode(vcpu))
> -			update_guest_cr3 = false;
> -		else if (!enable_unrestricted_guest && !is_paging(vcpu))
> +		if (!enable_unrestricted_guest && !is_paging(vcpu))
>  			guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
>  		else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))

As an alternative fix, what about marking VCPU_EXREG_CR3 dirty in
__set_sregs()? E.g.

		/*
		 * Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter, but
		 * it can be explicitly dirtied by KVM_SET_SREGS.
		 */
		if (is_guest_mode(vcpu) &&
		    !test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_dirty))

There's already a dependency on __set_sregs() doing
kvm_register_mark_available() before kvm_mmu_reset_context(), i.e. the
code is already a bit kludgy. The dirty check would make the kludge less
subtle and provide explicit documentation.
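
To make the intended behavior concrete, here is a rough standalone model of
the dirty check, not kernel code. Everything prefixed with model_ and the
struct layout are made up for illustration, and it assumes __set_sregs()
would mark CR3 dirty in addition to available. It also ignores when the
dirty bit gets cleared and the path that reads CR3 when it is not available.

```c
#include <stdbool.h>

/* Hypothetical stand-in for VCPU_EXREG_CR3; the bit number is arbitrary. */
enum { MODEL_EXREG_CR3 = 0 };

/* Simplified model of the vCPU state involved; not the kernel's struct. */
struct vcpu_model {
	unsigned long regs_avail;	/* cached register value is valid */
	unsigned long regs_dirty;	/* cached value still needs a write-back */
	unsigned long cr3;		/* cached guest CR3 */
	unsigned long vmcs_guest_cr3;	/* models the GUEST_CR3 field */
	bool guest_mode;		/* models is_guest_mode() */
};

static bool model_test_bit(int bit, unsigned long map)
{
	return (map >> bit) & 1UL;
}

/* Models KVM_SET_SREGS updating CR3, with the proposed dirty marking. */
static void model_set_sregs_cr3(struct vcpu_model *v, unsigned long cr3)
{
	v->cr3 = cr3;
	v->regs_avail |= 1UL << MODEL_EXREG_CR3;
	v->regs_dirty |= 1UL << MODEL_EXREG_CR3;	/* assumption: the new part */
}

/* Models only the CR3 decision in vmx_load_mmu_pgd() with the dirty check. */
static void model_load_pgd(struct vcpu_model *v)
{
	/* In guest mode, leave GUEST_CR3 to nested VM-Enter unless dirtied. */
	if (v->guest_mode && !model_test_bit(MODEL_EXREG_CR3, v->regs_dirty))
		return;

	if (model_test_bit(MODEL_EXREG_CR3, v->regs_avail))
		v->vmcs_guest_cr3 = v->cr3;
}
```

In this model, a load in guest mode without a preceding set_sregs leaves
the modeled GUEST_CR3 alone, while a load after set_sregs writes the new
value, which is the case the patch is trying to fix.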

>  			guest_cr3 = vcpu->arch.cr3;

The comment that's just below the context is now stale, e.g. replace
vmcs01.GUEST_CR3 with vmcs.GUEST_CR3.

> --
> 2.18.2
>
>