Re: [PATCH 1/2] KVM: MMU: fix ept=0/pte.u=0/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo

From: Xiao Guangrong
Date: Thu Mar 10 2016 - 07:15:07 EST




On 03/10/2016 06:09 PM, Paolo Bonzini wrote:


On 10/03/2016 09:27, Xiao Guangrong wrote:


+	if (!enable_ept) {
+		guest_efer |= EFER_NX;
+		ignore_bits |= EFER_NX;

Updating ignore_bits is not necessary, I think.

More precisely, ignore_bits is only needed if guest EFER.NX=0 and we're
not in this CR0.WP=0/CR4.SMEP=1 situation. In theory you could have
guest EFER.NX=1 and host EFER.NX=0.

That does not happen on Linux: the kernel always sets EFER.NX if CPUID
reports it, see arch/x86/kernel/head_64.S:

        /* Setup EFER (Extended Feature Enable Register) */
        movl    $MSR_EFER, %ecx
        rdmsr
        btsl    $_EFER_SCE, %eax        /* Enable System Call */
        btl     $20,%edi                /* No Execute supported? */
        jnc     1f
        btsl    $_EFER_NX, %eax
        btsq    $_PAGE_BIT_NX,early_pmd_flags(%rip)
1:      wrmsr                           /* Make changes effective */

So if the guest sees NX in its CPUID, then host EFER.NX should be 1.
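
For readers who prefer C to assembly, the boot-time logic quoted above can
be modelled roughly as follows. This is only a userspace sketch, not kernel
code: the function name boot_efer() and the CPUID_EDX_NX macro are
hypothetical names made up for the illustration, and the early_pmd_flags
update is left out. The bit positions (SCE = bit 0, NX = bit 11,
CPUID.80000001H:EDX bit 20) are the architectural ones.

```c
#include <stdint.h>

/* EFER bit masks, following the x86-64 architectural bit positions. */
#define EFER_SCE	(1ULL << 0)	/* SYSCALL/SYSRET enable */
#define EFER_NX		(1ULL << 11)	/* no-execute enable */

/* Hypothetical name: NX support flag in CPUID.80000001H:EDX (bit 20). */
#define CPUID_EDX_NX	(1U << 20)

/*
 * Hypothetical helper: given the EFER value read by rdmsr and the
 * extended CPUID feature word, return the value the boot code would
 * write back with wrmsr.
 */
uint64_t boot_efer(uint64_t efer, uint32_t cpuid_ext_edx)
{
	efer |= EFER_SCE;			/* always enable SYSCALL */
	if (cpuid_ext_edx & CPUID_EDX_NX)	/* NX reported by CPUID? */
		efer |= EFER_NX;
	return efer;
}
```

Under this model, NX is set in the result whenever bit 20 of the feature
word is set, whatever the input EFER value was.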


This is what I came up with (plus some comments :)):

	u64 guest_efer = vmx->vcpu.arch.efer;
	u64 ignore_bits = 0;

	if (!enable_ept) {
		if (boot_cpu_has(X86_FEATURE_SMEP))
			guest_efer |= EFER_NX;
		else if (!(guest_efer & EFER_NX))
			ignore_bits |= EFER_NX;
	}
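
To make the three possible outcomes of the snippet above easier to check,
here is a small userspace model of it. It is a sketch under assumptions:
enable_ept and the SMEP check are passed in as plain booleans instead of
being read from the module parameter and boot_cpu_has(), and the struct and
function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_NX	(1ULL << 11)	/* architectural bit position of EFER.NX */

/* Hypothetical container for the two values the snippet computes. */
struct efer_adjust {
	uint64_t guest_efer;
	uint64_t ignore_bits;
};

/* Hypothetical model of the quoted logic; not the kernel function. */
struct efer_adjust adjust_efer_for_shadow(uint64_t guest_efer,
					  bool enable_ept,
					  bool host_has_smep)
{
	struct efer_adjust r = { guest_efer, 0 };

	if (!enable_ept) {
		if (host_has_smep)
			r.guest_efer |= EFER_NX;	/* force NX on */
		else if (!(r.guest_efer & EFER_NX))
			r.ignore_bits |= EFER_NX;	/* NX may differ */
	}
	return r;
}
```

With shadow paging and SMEP available, NX is forced into guest_efer and
ignore_bits stays empty; without SMEP and with guest NX clear, only
ignore_bits gains EFER_NX; with EPT enabled, both values are unchanged.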

Your logic is right.

My suggestion is that we can keep ignore_bits = EFER_NX | EFER_SCE
(no need to adjust it conditionally), because EFER_NX must be the same
between guest and host if we switch EFER manually.

My patch is bigger but the resulting code is smaller and easier to follow:

	guest_efer = vmx->vcpu.arch.efer;
	if (!enable_ept)
		guest_efer |= EFER_NX;
	...
	if (...) {
		...
	} else {
		guest_efer &= ~ignore_bits;
		guest_efer |= host_efer & ignore_bits;
	}
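
The two assignments in the else branch above are a plain masked merge: for
every bit in ignore_bits, the result takes the host value, and every other
bit keeps the guest value. A minimal userspace sketch, where
merge_ignored_bits() is a hypothetical name and nothing else from the
elided parts of the function is modelled:

```c
#include <stdint.h>

#define EFER_SCE	(1ULL << 0)	/* SYSCALL/SYSRET enable */
#define EFER_NX		(1ULL << 11)	/* no-execute enable */

/*
 * Hypothetical helper: copy the bits selected by ignore_bits from
 * host_efer into guest_efer and leave all other guest bits alone.
 */
uint64_t merge_ignored_bits(uint64_t guest_efer, uint64_t host_efer,
			    uint64_t ignore_bits)
{
	guest_efer &= ~ignore_bits;		/* drop the ignored guest bits */
	guest_efer |= host_efer & ignore_bits;	/* take them from the host */
	return guest_efer;
}
```

With ignore_bits = EFER_NX | EFER_SCE, the merged value therefore always
carries the host's NX and SCE bits, which is why the mask only makes sense
when those bits are allowed to differ from what the guest asked for.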

I agree. :)