Re: [PATCH 1/5] KVM: x86: avoid atomic operations on APICv vmentry
From: Michael S. Tsirkin
Date: Sat Oct 15 2016 - 23:21:50 EST
On Fri, Oct 14, 2016 at 08:21:27PM +0200, Paolo Bonzini wrote:
> On some benchmarks (e.g. netperf with ioeventfd disabled), APICv
> posted interrupts turn out to be slower than interrupt injection via
> KVM_REQ_EVENT.
>
> This patch optimizes a bit the IRR update, avoiding expensive atomic
> operations in the common case where PI.ON=0 at vmentry or the PIR vector
> is mostly zero. This saves at least 20 cycles (1%) per vmexit, as
> measured by kvm-unit-tests' inl_from_qemu test (20 runs):
>
> | enable_apicv=1 | enable_apicv=0
> | mean stdev | mean stdev
> ----------|-----------------|------------------
> before | 5826 32.65 | 5765 47.09
> after | 5809 43.42 | 5777 77.02
>
> Of course, any change in the right column is just placebo effect. :)
> The savings are bigger if interrupts are frequent.
>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> arch/x86/kvm/lapic.c | 6 ++++--
> arch/x86/kvm/vmx.c | 9 ++++++++-
> 2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 23b99f305382..63a442aefc12 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -342,9 +342,11 @@ void __kvm_apic_update_irr(u32 *pir, void *regs)
> u32 i, pir_val;
>
> for (i = 0; i <= 7; i++) {
> - pir_val = xchg(&pir[i], 0);
> - if (pir_val)
> + pir_val = READ_ONCE(pir[i]);
> + if (pir_val) {
> + pir_val = xchg(&pir[i], 0);
> *((u32 *)(regs + APIC_IRR + i * 0x10)) |= pir_val;
> + }
> }
> }
> EXPORT_SYMBOL_GPL(__kvm_apic_update_irr);
gcc doesn't seem to unroll this loop and it's
probably worth unrolling it
The following seems to do the trick for me on upstream - I didn't
benchmark it though. Is there a kvm unit test for interrupts?
--->
kvm: unroll the loop in __kvm_apic_update_irr.
This is hot data path in interrupt-rich workloads, worth unrolling.
Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b62c852..0c3462c 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -337,7 +337,8 @@ static u8 count_vectors(void *bitmap)
return count;
}
-void __kvm_apic_update_irr(u32 *pir, void *regs)
+void __attribute__((optimize("unroll-loops")))
+__kvm_apic_update_irr(u32 *pir, void *regs)
{
u32 i, pir_val;