[PATCH RFC 0/7] x86/kvm/nVMX: optimize MMU switch between L1 and L2

From: Vitaly Kuznetsov
Date: Fri Jul 20 2018 - 09:26:32 EST


Currently, when we switch from L1 to L2 we do the following:
- Re-initialize L1 MMU as shadow EPT MMU (nested_ept_init_mmu_context())
- Re-initialize 'nested' MMU (nested_vmx_load_cr3() -> init_kvm_nested_mmu())
- Reload MMU root upon guest entry.

When we switch back we do:
- Re-initialize L1 MMU (nested_vmx_load_cr3() -> init_kvm_tdp_mmu())
- Reload MMU root upon guest entry.

This seems to be sub-optimal: initializing the MMU is expensive (mostly
due to update_permission_bitmask(), update_pkru_bitmask(), ...) and
reloading the MMU root doesn't come for free either.
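
To make the cost concrete, here is a tiny standalone sketch (illustrative
only, not kernel code; all toy_* names are made up) of the pattern above:
every switch, in either direction, rebuilds the derived MMU state from
scratch and throws the current root away:

#include <stdbool.h>
#include <string.h>

struct toy_mmu {
    unsigned long root_hpa;        /* dropped on every switch, reloaded on entry */
    unsigned char permissions[16]; /* stands in for what update_permission_bitmask() */
    unsigned int pkru_mask;        /* and update_pkru_bitmask() recompute */
};

/* Models a full init_kvm_tdp_mmu()/kvm_init_shadow_ept_mmu()-style re-init. */
static void toy_init_mmu(struct toy_mmu *mmu, bool shadow_ept)
{
    int i;

    memset(mmu, 0, sizeof(*mmu));
    for (i = 0; i < 16; i++)               /* rebuilt from scratch every time */
        mmu->permissions[i] = shadow_ept ? 0x7 : 0x3;
    mmu->pkru_mask = shadow_ept ? 0 : ~0u;
    mmu->root_hpa = ~0ul;                  /* root invalid -> reload on next entry */
}

int main(void)
{
    struct toy_mmu mmu;

    toy_init_mmu(&mmu, false);  /* L1 running */
    toy_init_mmu(&mmu, true);   /* vmentry to L2: full re-init */
    toy_init_mmu(&mmu, false);  /* vmexit to L1: full re-init again */
    return 0;
}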

Approach the issue by splitting the L1-normal and L1-nested MMUs and
checking whether an MMU reset is really needed. This spares us about
1000 CPU cycles on a nested vmexit.
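
A rough standalone sketch of the idea follows; it is purely illustrative
and not the series' actual code (the toy_* names, the 'scache' fields and
toy_mmu_update_needed() are invented here). Keep two persistent MMU
contexts per vCPU, flip a pointer between them on vmentry/vmexit, and only
re-initialize a context when the cached inputs it was built from have
changed:

#include <stdbool.h>

struct toy_scache {                  /* inputs the MMU context was built from */
    unsigned long cr0, cr4, efer;
    bool valid;
};

struct toy_mmu {
    struct toy_scache scache;
    unsigned long root_hpa;
    unsigned char permissions[16];
    unsigned int pkru_mask;
};

struct toy_vcpu {
    struct toy_mmu root_mmu;         /* L1, non-nested paging */
    struct toy_mmu guest_mmu;        /* shadow EPT context for running L2 */
    struct toy_mmu *mmu;             /* points at whichever context is active */
};

/* The "is a reset really needed?" check: compare against the cached inputs. */
static bool toy_mmu_update_needed(struct toy_mmu *mmu, unsigned long cr0,
                                  unsigned long cr4, unsigned long efer)
{
    return !mmu->scache.valid || mmu->scache.cr0 != cr0 ||
           mmu->scache.cr4 != cr4 || mmu->scache.efer != efer;
}

static void toy_init_mmu(struct toy_mmu *mmu, bool shadow_ept, unsigned long cr0,
                         unsigned long cr4, unsigned long efer)
{
    int i;

    for (i = 0; i < 16; i++)
        mmu->permissions[i] = shadow_ept ? 0x7 : 0x3;
    mmu->pkru_mask = shadow_ept ? 0 : ~0u;
    mmu->scache = (struct toy_scache){ cr0, cr4, efer, true };
}

/* vmentry to L2: flip the pointer, re-init only when the inputs changed. */
static void toy_switch_to_l2(struct toy_vcpu *vcpu, unsigned long cr0,
                             unsigned long cr4, unsigned long efer)
{
    vcpu->mmu = &vcpu->guest_mmu;
    if (toy_mmu_update_needed(vcpu->mmu, cr0, cr4, efer))
        toy_init_mmu(vcpu->mmu, true, cr0, cr4, efer);
}

/* vmexit to L1: the root_mmu context (and its root) is still around. */
static void toy_switch_to_l1(struct toy_vcpu *vcpu, unsigned long cr0,
                             unsigned long cr4, unsigned long efer)
{
    vcpu->mmu = &vcpu->root_mmu;
    if (toy_mmu_update_needed(vcpu->mmu, cr0, cr4, efer))
        toy_init_mmu(vcpu->mmu, false, cr0, cr4, efer);
}

int main(void)
{
    struct toy_vcpu vcpu = { .mmu = &vcpu.root_mmu };

    toy_switch_to_l2(&vcpu, 0x80000031, 0x2020, 0x500);  /* first entry: re-init */
    toy_switch_to_l1(&vcpu, 0x80000031, 0x2020, 0x500);  /* exit: re-init L1 once */
    toy_switch_to_l2(&vcpu, 0x80000031, 0x2020, 0x500);  /* re-entry: no re-init */
    return 0;
}

In the real patches the cached inputs would of course have to include
everything that influences the MMU role and the permission/pkru bitmasks,
which is what the mmu_update_needed() question below is about.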

RFC part:
- Does this look like a plausible solution?
- Nested SVM can probably be optimized in the same way.
- Does mmu_update_needed() cover everything?

Vitaly Kuznetsov (7):
x86/kvm/mmu: make vcpu->mmu a pointer to the current MMU
x86/kvm/mmu.c: set get_pdptr hook in kvm_init_shadow_ept_mmu()
x86/kvm/mmu.c: add kvm_mmu parameter to kvm_mmu_free_roots()
x86/kvm/mmu: introduce guest_mmu
x86/kvm/mmu: get rid of redundant kvm_mmu_setup()
x86/kvm/nVMX: introduce scache for kvm_init_shadow_ept_mmu
x86/kvm/nVMX: optimize MMU switch from nested_vmx_load_cr3()

 arch/x86/include/asm/kvm_host.h |  36 ++++-
 arch/x86/kvm/cpuid.c            |   2 +-
 arch/x86/kvm/mmu.c              | 282 ++++++++++++++++++++++++++--------------
 arch/x86/kvm/mmu.h              |   2 +-
 arch/x86/kvm/mmu_audit.c        |  12 +-
 arch/x86/kvm/paging_tmpl.h      |  17 +--
 arch/x86/kvm/svm.c              |  20 +--
 arch/x86/kvm/vmx.c              |  52 +++++---
 arch/x86/kvm/x86.c              |  34 ++---
9 files changed, 292 insertions(+), 165 deletions(-)

--
2.14.4