[PATCH] x86/kvm/mmu: make mmu->prev_roots cache work for NPT case

From: Vitaly Kuznetsov
Date: Fri Feb 22 2019 - 11:46:22 EST


I noticed that fast_cr3_switch() always fails when we switch back from L2
to L1 as it is not able to find a cached root. This is odd: host's CR3
usually stays the same, we expect to always follow the fast path. Turns
out the problem is that page role is always mismatched because
kvm_mmu_get_page() filters out cr4_pae when direct, the value is stored
in page header and later compared with new_role in cached_root_available().
As cr4_pae is always set in long mode prev_roots cache is dysfunctional.

The problem appeared after we introduced kvm_calc_mmu_role_common():
previously, we were only setting this bit for shadow MMU root but now
we set it for everything. Restore the original behavior.

Fixes: 7dcd57552008 ("x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed")
Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
---
RFC:
Alternatively, I can suggest two solutions:
- Do not clear cr4_pae in kvm_mmu_get_page() and check direct on call sites
(detect_write_misaligned(), get_written_sptes()).
- Filter cr4_pae out when direct in kvm_mmu_new_cr3().
The code in kvm_mmu_get_page() is very ancient, I'm afraid to touch it :=)
---
arch/x86/kvm/mmu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f2d1d230d5b8..c729e98eee49 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4791,7 +4791,6 @@ static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,

role.base.access = ACC_ALL;
role.base.nxe = !!is_nx(vcpu);
- role.base.cr4_pae = !!is_pae(vcpu);
role.base.cr0_wp = is_write_protection(vcpu);
role.base.smm = is_smm(vcpu);
role.base.guest_mode = is_guest_mode(vcpu);
@@ -4871,6 +4870,8 @@ kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu, bool base_only)
{
union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, base_only);

+ role.base.cr4_pae = !!is_pae(vcpu);
+
role.base.smep_andnot_wp = role.ext.cr4_smep &&
!is_write_protection(vcpu);
role.base.smap_andnot_wp = role.ext.cr4_smap &&
--
2.20.1