[PATCH v3 4/6] KVM: X86: Implement PV IPIs send hypercall

From: Wanpeng Li
Date: Tue Jul 03 2018 - 02:21:52 EST


From: Wanpeng Li <wanpengli@xxxxxxxxxxx>

Using hypercall to send IPIs by one vmexit instead of one by one for
xAPIC/x2APIC physical mode and one vmexit per-cluster for x2APIC cluster
mode. Intel guest can enter x2apic cluster mode when interrupt remmaping
is enabled in qemu, however, latest AMD EPYC still just supports xapic
mode which can get great improvement by PV IPIs.

Hardware: Xeon Skylake 2.5GHz, 2 sockets, 40 cores, 80 threads, the VM
is 80 vCPUs, IPI microbenchmark(https://lkml.org/lkml/2017/12/19/141):

x2apic cluster mode, vanilla

Dry-run: 0, 2392199 ns
Self-IPI: 6907514, 15027589 ns
Normal IPI: 223910476, 251301666 ns
Broadcast IPI: 0, 9282161150 ns
Broadcast lock: 0, 8812934104 ns

x2apic cluster mode, pv-ipi

Dry-run: 0, 2449341 ns
Self-IPI: 6720360, 15028732 ns
Normal IPI: 228643307, 255708477 ns
Broadcast IPI: 0, 7572293590 ns => 22% performance boost
Broadcast lock: 0, 8316124651 ns

x2apic physical mode, vanilla

Dry-run: 0, 3135933 ns
Self-IPI: 8572670, 17901757 ns
Normal IPI: 226444334, 255421709 ns
Broadcast IPI: 0, 19845070887 ns
Broadcast lock: 0, 19827383656 ns

x2apic physical mode, pv-ipi

Dry-run: 0, 2446381 ns
Self-IPI: 6788217, 15021056 ns
Normal IPI: 219454441, 249583458 ns
Broadcast IPI: 0, 7806540019 ns => 154% performance boost
Broadcast lock: 0, 9143618799 ns

Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Radim KrÄmÃÅ <rkrcmar@xxxxxxxxxx>
Cc: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
---
Documentation/virtual/kvm/hypercalls.txt | 6 ++++++
arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++
2 files changed, 42 insertions(+)

diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
index a890529..a771ee8 100644
--- a/Documentation/virtual/kvm/hypercalls.txt
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -121,3 +121,9 @@ compute the CLOCK_REALTIME for its clock, at the same instant.

Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
+
+6. KVM_HC_SEND_IPI
+------------------------
+Architecture: x86
+Status: active
+Purpose: Hypercall used to send IPIs.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0046aa7..46657ef 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6689,6 +6689,39 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
}

+/*
+ * Return 0 if successfully added and 1 if discarded.
+ */
+static int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
+ unsigned long ipi_bitmap_high, int vector)
+{
+ int i;
+ struct kvm_apic_map *map;
+ struct kvm_vcpu *vcpu;
+ struct kvm_lapic_irq irq = {
+ .delivery_mode = APIC_DM_FIXED,
+ .vector = vector,
+ };
+
+ rcu_read_lock();
+ map = rcu_dereference(kvm->arch.apic_map);
+
+ for_each_set_bit(i, &ipi_bitmap_low, BITS_PER_LONG) {
+ vcpu = map->phys_map[i]->vcpu;
+ if (!kvm_apic_set_irq(vcpu, &irq, NULL))
+ return 1;
+ }
+
+ for_each_set_bit(i, &ipi_bitmap_high, BITS_PER_LONG) {
+ vcpu = map->phys_map[i + BITS_PER_LONG]->vcpu;
+ if (!kvm_apic_set_irq(vcpu, &irq, NULL))
+ return 1;
+ }
+
+ rcu_read_unlock();
+ return 0;
+}
+
void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
{
vcpu->arch.apicv_active = false;
@@ -6737,6 +6770,9 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
case KVM_HC_CLOCK_PAIRING:
ret = kvm_pv_clock_pairing(vcpu, a0, a1);
break;
+ case KVM_HC_SEND_IPI:
+ ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2);
+ break;
#endif
default:
ret = -KVM_ENOSYS;
--
2.7.4