Re: [PATCH] KVM: nVMX: nested VPID emulation

From: Wanpeng Li
Date: Tue Sep 15 2015 - 06:18:50 EST

Next message: Fu Wei: "Re: [PATCH v7 2/8] ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts"
Previous message: Fu Wei: "Re: [PATCH v7 2/8] ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts"
In reply to: Bandan Das: "Re: [PATCH] KVM: nVMX: nested VPID emulation"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 9/15/15 12:08 AM, Bandan Das wrote:

Wanpeng Li <wanpeng.li@xxxxxxxxxxx> writes:

VPID is used to tag address space and avoid a TLB flush. Currently L0 use
the same VPID to run L1 and all its guests. KVM flushes VPID when switching
between L1 and L2.

This patch advertises VPID to the L1 hypervisor, then address space of L1 and
L2 can be separately treated and avoid TLB flush when swithing between L1 and
L2. This patch gets ~3x performance improvement for lmbench 8p/64k ctxsw.

TLB flush does context invalidation and while that should result in
some improvement, I never expected a 3x improvement for any workload!
Interesting :)

The result still looks good when test v2.

Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
---
arch/x86/kvm/vmx.c | 39 ++++++++++++++++++++++++++++++++-------
1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index da1590e..06bc31e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1157,6 +1157,11 @@ static inline bool nested_cpu_has_virt_x2apic_mode(struct vmcs12 *vmcs12)
return nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE);
}
+static inline bool nested_cpu_has_vpid(struct vmcs12 *vmcs12)
+{
+ return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_VPID);
+}
+
static inline bool nested_cpu_has_apic_reg_virt(struct vmcs12 *vmcs12)
{
return nested_cpu_has2(vmcs12, SECONDARY_EXEC_APIC_REGISTER_VIRT);
@@ -2471,6 +2476,7 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
SECONDARY_EXEC_RDTSCP |
SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
+ SECONDARY_EXEC_ENABLE_VPID |
SECONDARY_EXEC_APIC_REGISTER_VIRT |
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_WBINVD_EXITING |
@@ -4160,7 +4166,7 @@ static void allocate_vpid(struct vcpu_vmx *vmx)
int vpid;
vmx->vpid = 0;
- if (!enable_vpid)
+ if (!enable_vpid || is_guest_mode(&vmx->vcpu))
return;
spin_lock(&vmx_vpid_lock);
vpid = find_first_zero_bit(vmx_vpid_bitmap, VMX_NR_VPIDS);
@@ -6738,6 +6744,14 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
}
vmcs12 = kmap(page);
vmcs12->launch_state = 0;
+ if (enable_vpid) {
+ if (nested_cpu_has_vpid(vmcs12)) {
+ spin_lock(&vmx_vpid_lock);
+ if (vmcs12->virtual_processor_id != 0)
+ __clear_bit(vmcs12->virtual_processor_id, vmx_vpid_bitmap);
+ spin_unlock(&vmx_vpid_lock);
+ }
+ }
kunmap(page);
nested_release_page(page);

I don't think this is enough, we should also check for set "nested" bits
in free_vpid() and clear them. There should be some sort of a mapping between the
nested guest vpid and the actual vpid so that we can just clear those bits.

Agreed.

@@ -9189,6 +9203,7 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 exec_control;
+ int vpid;
vmcs_write16(GUEST_ES_SELECTOR, vmcs12->guest_es_selector);
vmcs_write16(GUEST_CS_SELECTOR, vmcs12->guest_cs_selector);
@@ -9438,13 +9453,21 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
else
vmcs_write64(TSC_OFFSET, vmx->nested.vmcs01_tsc_offset);
+
Empty space here.

if (enable_vpid) {
- /*
- * Trivially support vpid by letting L2s share their parent
- * L1's vpid. TODO: move to a more elaborate solution, giving
- * each L2 its own vpid and exposing the vpid feature to L1.
- */
- vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
+ if (nested_cpu_has_vpid(vmcs12)) {
+ if (vmcs12->virtual_processor_id == 0) {

Ok, so if we advertise vpid to the nested hypervisor, isn't it going to
attempt writing this field when setting up ? Atleast
that's what Linux does, no ?

Agreed, I do the allocation of vpid02 during initialization in v2.

+ spin_lock(&vmx_vpid_lock);
+ vpid = find_first_zero_bit(vmx_vpid_bitmap, VMX_NR_VPIDS);
+ if (vpid < VMX_NR_VPIDS)
+ __set_bit(vpid, vmx_vpid_bitmap);
+ spin_unlock(&vmx_vpid_lock);
+ vmcs_write16(VIRTUAL_PROCESSOR_ID, vpid);
+ } else
+ vmcs_write16(VIRTUAL_PROCESSOR_ID, vmcs12->virtual_processor_id);
+ } else
+ vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
+

I guess L1 shouldn't know what vpid L0 chose to run L2. If L1 vmreads,
it should get what it expects for the value of vpid, not the one L0 chose.

Agreed.

vmx_flush_tlb(vcpu);
}

So, this isn't removed ? I thought it's not needed anymore ?

Please review v2. :-)

Regards,
Wanpeng Li

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Fu Wei: "Re: [PATCH v7 2/8] ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts"
Previous message: Fu Wei: "Re: [PATCH v7 2/8] ARM64: add SBSA Generic Watchdog device node in foundation-v8.dts"
In reply to: Bandan Das: "Re: [PATCH] KVM: nVMX: nested VPID emulation"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]