Re: [PATCH 4/6] kvm: nVMX: support EPT accessed/dirty bits
From: Radim KrÄmÃÅ
Date: Fri Mar 31 2017 - 12:24:42 EST
2017-03-30 11:55+0200, Paolo Bonzini:
> Now use bit 6 of EPTP to optionally enable A/D bits for EPTP. Another
> thing to change is that, when EPT accessed and dirty bits are not in use,
> VMX treats accesses to guest paging structures as data reads. When they
> are in use (bit 6 of EPTP is set), they are treated as writes and the
> corresponding EPT dirty bit is set. The MMU didn't know this detail,
> so this patch adds it.
>
> We also have to fix up the exit qualification. It may be wrong because
> KVM sets bit 6 but the guest might not.
>
> L1 emulates EPT A/D bits using write permissions, so in principle it may
> be possible for EPT A/D bits to be used by L1 even though not available
> in hardware. The problem is that guest page-table walks will be treated
> as reads rather than writes, so they would not cause an EPT violation.
>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> @@ -319,6 +310,14 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
> ASSERT(!(is_long_mode(vcpu) && !is_pae(vcpu)));
>
> accessed_dirty = have_ad ? PT_GUEST_ACCESSED_MASK : 0;
> +
> + /*
> + * FIXME: on Intel processors, loads of the PDPTE registers for PAE paging
> + * by the MOV to CR instruction are treated as reads and do not cause the
> + * processor to set the dirty flag in tany EPT paging-structure entry.
^
typo
> + */
> + nested_access = (have_ad ? PFERR_WRITE_MASK : 0) | PFERR_USER_MASK;
> +
This special case should be fairly safe if I understand the consequences
correctly,
Reviewed-by: Radim KrÄmÃÅ <rkrcmar@xxxxxxxxxx>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> @@ -6211,6 +6213,18 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
> + if (is_guest_mode(vcpu)
> + && !(exit_qualification & EPT_VIOLATION_GVA_TRANSLATED)) {
> + /*
> + * Fix up exit_qualification according to whether guest
> + * page table accesses are reads or writes.
> + */
> + u64 eptp = nested_ept_get_cr3(vcpu);
> + exit_qualification &= ~EPT_VIOLATION_ACC_WRITE;
> + if (eptp & VMX_EPT_AD_ENABLE_BIT)
> + exit_qualification |= EPT_VIOLATION_ACC_WRITE;
I think this would be better without unconditional clearing
if (!(eptp & VMX_EPT_AD_ENABLE_BIT))
exit_qualification &= ~EPT_VIOLATION_ACC_WRITE;