Re: [RFC PATCH 1/2] KVM: MMU: precompute page fault error code

From: Xiao Guangrong
Date: Thu Mar 10 2016 - 09:02:16 EST

On 03/08/2016 07:45 PM, Paolo Bonzini wrote:
For the next patch, we will want to filter PFERR_FETCH_MASK away early,
and not pass it to permission_fault if neither NX nor SMEP are enabled.
Prepare for the change.

Why is it needed? It is much easier to drop PFEC.F in
update_permission_bitmask().
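
A minimal sketch of that alternative (hypothetical, not the posted
patch): clear the fetch bit inside update_permission_bitmask()'s
per-pfec loop, reusing the cr4_smep value the function already
computes, so the walker never has to strip it:

		ff = pfec & PFERR_FETCH_MASK;
		if (!ept && !mmu->nx && !cr4_smep)
			ff = false;	/* a fetch can never fault, treat it as a read */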


Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
---
arch/x86/kvm/mmu.c | 2 +-
arch/x86/kvm/paging_tmpl.h | 26 +++++++++++++++-----------
2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 2463de0b935c..e57f7be061e3 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3883,7 +3883,7 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
u = bit & ACC_USER_MASK;

if (!ept) {
- /* Not really needed: !nx will cause pte.nx to fault */
+ /* Not really needed: if !nx, ff will always be zero */
x |= !mmu->nx;
/* Allow supervisor writes if !cr0.wp */
w |= !is_write_protection(vcpu) && !uf;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 6013f3685ef4..285858d3223b 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -272,13 +272,24 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
gpa_t pte_gpa;
int offset;
const int write_fault = access & PFERR_WRITE_MASK;
- const int user_fault = access & PFERR_USER_MASK;
- const int fetch_fault = access & PFERR_FETCH_MASK;
- u16 errcode = 0;
+ u16 errcode;
gpa_t real_gpa;
gfn_t gfn;

trace_kvm_mmu_pagetable_walk(addr, access);
+
+ /*
+ * Do not modify PFERR_FETCH_MASK in access. It is used later in the call to
+ * mmu->translate_gpa and, when nested virtualization is in use, the X or NX
+ * bit of nested page tables always applies---even if NX and SMEP are disabled
+ * in the guest.
+ *
+ * TODO: cache the result of the NX and SMEP test in struct kvm_mmu?
+ */
+ errcode = access;
+ if (!(mmu->nx || kvm_read_cr4_bits(vcpu, X86_CR4_SMEP)))
+ errcode &= ~PFERR_FETCH_MASK;
+

PFEC.P might be set since it is copied from @access.
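
If the walker must start from @access, one hypothetical way to avoid
leaking PFEC.P would be to mask it off together with the fetch bit
(a sketch only, whether P should instead be recomputed later in the
walk is a separate question):

	errcode = access & ~PFERR_PRESENT_MASK;
	if (!(mmu->nx || kvm_read_cr4_bits(vcpu, X86_CR4_SMEP)))
		errcode &= ~PFERR_FETCH_MASK;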

And the workload is moved from a rare path to the common path...

retry_walk:
walker->level = mmu->root_level;
pte = mmu->get_cr3(vcpu);
@@ -389,9 +400,7 @@ retry_walk:

if (unlikely(!accessed_dirty)) {
ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, write_fault);
- if (unlikely(ret < 0))
- goto error;
- else if (ret)
+ if (ret > 0)
goto retry_walk;

So it returns normally even if update_accessed_dirty_bits() fails.
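
Keeping the original error handling would avoid that, e.g.:

	ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, write_fault);
	if (unlikely(ret < 0))
		goto error;	/* propagate the failure instead of dropping it */
	else if (ret)
		goto retry_walk;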