Re: [PATCH v3 2/5] KVM: MMU: check guest CR3 reserved bits based on its physical address width.

From: Yu Zhang
Date: Mon Sep 18 2017 - 04:38:47 EST




On 9/16/2017 7:19 AM, Jim Mattson wrote:
On Thu, Aug 24, 2017 at 5:27 AM, Yu Zhang <yu.c.zhang@xxxxxxxxxxxxxxx> wrote:
Currently, KVM uses CR3_L_MODE_RESERVED_BITS to check the
reserved bits in CR3. Yet the length of reserved bits in
guest CR3 should be based on the physical address width
exposed to the VM. This patch changes CR3 check logic to
calculate the reserved bits at runtime.

Signed-off-by: Yu Zhang <yu.c.zhang@xxxxxxxxxxxxxxx>
---
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/emulate.c | 14 ++++++++++++--
arch/x86/kvm/mmu.h | 3 +++
arch/x86/kvm/x86.c | 8 ++++----
4 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6db0ed9..e716228 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -79,7 +79,6 @@
| X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
| X86_CR0_NW | X86_CR0_CD | X86_CR0_PG))

-#define CR3_L_MODE_RESERVED_BITS 0xFFFFFF0000000000ULL
#define CR3_PCID_INVD BIT_64(63)
#define CR4_RESERVED_BITS \
(~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 319d91f..a89b595 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -28,6 +28,7 @@

#include "x86.h"
#include "tss.h"
+#include "mmu.h"

/*
* Operand types
@@ -4097,8 +4098,17 @@ static int check_cr_write(struct x86_emulate_ctxt *ctxt)
u64 rsvd = 0;

ctxt->ops->get_msr(ctxt, MSR_EFER, &efer);
- if (efer & EFER_LMA)
- rsvd = CR3_L_MODE_RESERVED_BITS & ~CR3_PCID_INVD;
+ if (efer & EFER_LMA) {
+ u64 maxphyaddr;
+ u32 eax = 0x80000008;
+
+ if (ctxt->ops->get_cpuid(ctxt, &eax, NULL, NULL,
+ NULL, false))

Passing NULL for the address of ecx looks problematic to me.

We have:

static bool emulator_get_cpuid(struct x86_emulate_ctxt *ctxt,
			u32 *eax, u32 *ebx, u32 *ecx, u32 *edx, bool check_limit)
{
	return kvm_cpuid(emul_to_vcpu(ctxt), eax, ebx, ecx, edx, check_limit);
}

And:

bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
	       u32 *ecx, u32 *edx, bool check_limit)
{
	u32 function = *eax, index = *ecx;
	struct kvm_cpuid_entry2 *best;
	bool entry_found = true;
	...

Doesn't this immediately try to dereference a NULL pointer? How much
testing have you done of this code?

Thanks, Jim.
I tested this code in a simulator and successfully booted a VM in shadow mode,
so it seems this path was not exercised (though I am still puzzled as to why).
Is it possible that check_cr_write() is not triggered when emulating CR
operations?

Either way, this is a bug. Thanks for pointing it out; I'll send a fix later.
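
For illustration only, here is a minimal userspace sketch of one way the call could avoid the NULL dereference: pass real locals for every register and derive the CR3 reserved mask from the reported physical address width. This is not the follow-up patch. The names mock_cpuid(), mock_phys_bits, bits_mask() and cr3_rsvd_mask() are hypothetical, and the fallback width of 36 used when the leaf is missing is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical knob: the physical address width the mock CPUID reports. */
static unsigned int mock_phys_bits = 40;

/*
 * Hypothetical stand-in for kvm_cpuid(). Like the code quoted above, it
 * reads *eax and *ecx on entry, so no pointer may be NULL. The assert
 * makes that requirement explicit in this userspace mock.
 */
static bool mock_cpuid(uint32_t *eax, uint32_t *ebx,
		       uint32_t *ecx, uint32_t *edx)
{
	assert(eax && ebx && ecx && edx);

	uint32_t function = *eax, index = *ecx;

	(void)index; /* leaf 0x80000008 has no sub-leaves */
	if (function != 0x80000008)
		return false;

	*eax = mock_phys_bits; /* EAX[7:0] = physical address width */
	*ebx = *ecx = *edx = 0;
	return true;
}

/* Mask with bits s..e (inclusive) set; requires e - s + 1 < 64. */
static uint64_t bits_mask(unsigned int s, unsigned int e)
{
	return ((1ULL << (e - s + 1)) - 1) << s;
}

/* Reserved CR3 bits in long mode: MAXPHYADDR..62 (bit 63 is the PCID flag). */
static uint64_t cr3_rsvd_mask(void)
{
	/* Real locals for every register, so nothing dereferences NULL. */
	uint32_t eax = 0x80000008, ebx = 0, ecx = 0, edx = 0;
	unsigned int maxphyaddr = 36; /* assumed fallback if the leaf is absent */

	if (mock_cpuid(&eax, &ebx, &ecx, &edx))
		maxphyaddr = eax & 0xff;

	return bits_mask(maxphyaddr, 62);
}
```

With a reported width of 40 bits, cr3_rsvd_mask() yields 0x7FFFFF0000000000, the same value as the removed CR3_L_MODE_RESERVED_BITS & ~CR3_PCID_INVD.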

BR
Yu