Re: [PART1 RFC 6/9] svm: Add interrupt injection via AVIC

From: Suravee Suthikulpanit
Date: Fri Feb 12 2016 - 11:21:42 EST


Hi Paolo,

On 02/12/2016 10:55 PM, Paolo Bonzini wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 4244c2b..2def290 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -8087,7 +8087,9 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> >  	if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
> >  		kvm_x86_ops->check_nested_events(vcpu, false);
> >
> > -	return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
> > +	return (kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu) ||
> > +		(kvm_x86_ops->apicv_intr_pending &&
> > +		 kvm_x86_ops->apicv_intr_pending(vcpu)));
> >  }
>
> I think this is not necessary. What you need is to make kvm_lapic's
> regs field point to the backing page. Then when the processor writes to
> IRR, kvm_apic_has_interrupt (called through kvm_vcpu_has_events) will
> see it.
>
> avic_pending_cnt shouldn't be necessary either.
>
> Paolo
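
For illustration, here is a minimal user-space sketch of why pointing regs at the backing page would be enough: once both sides share the same page, a plain scan of the IRR words finds whatever the hardware has posted. The IRR layout (eight 32-bit registers starting at offset 0x200, 0x10 bytes apart) is the standard local APIC layout; the helper names set_irr() and highest_pending_vector() are made up for this example, and the scan ignores priority (PPR), which the real kvm_apic_has_interrupt() path takes into account.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define APIC_PAGE_SIZE  4096
#define APIC_IRR_BASE   0x200  /* first IRR register in the APIC page */
#define APIC_REG_STRIDE 0x10   /* APIC registers are 16 bytes apart */

/* Hypothetical helper: mark a vector as requested, as the hardware would
 * do when it writes to the backing page. */
static void set_irr(uint8_t *regs, int vector)
{
	uint8_t *p = regs + APIC_IRR_BASE + (vector / 32) * APIC_REG_STRIDE;
	uint32_t word;

	assert(vector >= 0 && vector < 256);
	memcpy(&word, p, sizeof(word));
	word |= 1u << (vector % 32);
	memcpy(p, &word, sizeof(word));
}

/* Hypothetical helper: return the highest vector set in IRR, or -1 if
 * nothing is pending. Priority masking is deliberately left out. */
static int highest_pending_vector(const uint8_t *regs)
{
	for (int i = 7; i >= 0; i--) {
		uint32_t word;

		memcpy(&word, regs + APIC_IRR_BASE + i * APIC_REG_STRIDE,
		       sizeof(word));
		if (word) {
			int bit = 31;

			while (!(word & (1u << bit)))
				bit--;
			return i * 32 + bit;
		}
	}
	return -1;
}
```

If the software view and the hardware view are the same page, no separate pending counter is needed for this check; the scan above sees the write directly.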

The other thing I am using avic_pending_cnt for is part 2 of the series (enabling AVIC support in the IOMMU), which I am planning to send out later. However, it might be good to discuss it at this point.

When the IOMMU cannot inject an interrupt into the guest vcpu because the vcpu is not running (and therefore cannot be sent a doorbell directly), it logs the interrupt in the GA log buffer. It then generates an interrupt to notify the IOMMU driver that it needs to handle the log entry. The IOMMU driver, in turn, ends up notifying SVM to schedule the vcpu in so that it can process the interrupt.

Here, I have run into an issue where the vcpu often goes idle (i.e. is scheduled out), which causes the IOMMU to generate a lot of entries in the GA log. This really hurts device pass-through performance (e.g. for the XGBE NIC).

So, what I ended up experimenting with is setting avic_pending_cnt to a larger value (i.e. avic_ga_log_threshold) whenever we process a GA log entry. The intention is to delay scheduling the vcpu out, in the expectation that more interrupts might be coming in soon. I also made this threshold value tunable as a module_param.

This actually works well in my experiment: I get about a 5% speedup in my netperf test with XGBE NIC pass-through.
However, I am not sure whether this is an acceptable approach. Actually, I think it's similar to halt_poll_ns, but specifically for the IOMMU GA log in this case.
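
To make the idea concrete, here is a rough user-space sketch of the heuristic, not the actual patch. Only avic_pending_cnt and avic_ga_log_threshold are names from the series; struct vcpu_sketch, ga_log_notify(), vcpu_runnable_sketch(), the default threshold of 4, and the assumption that the count drops by one on each runnable check are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-vcpu state; not the real struct. */
struct vcpu_sketch {
	int avic_pending_cnt;	/* remaining "stay runnable" budget */
};

/* Assumed default for the sketch; a module_param in the experiment. */
static int avic_ga_log_threshold = 4;

/* Called when the IOMMU driver reports a GA log entry for this vcpu:
 * refill the budget so the vcpu is not scheduled out right away. */
static void ga_log_notify(struct vcpu_sketch *v)
{
	assert(avic_ga_log_threshold >= 0);
	v->avic_pending_cnt = avic_ga_log_threshold;
}

/* Runnable check: report runnable while budget remains, consuming one
 * unit per check (an assumption of this sketch). */
static bool vcpu_runnable_sketch(struct vcpu_sketch *v)
{
	if (v->avic_pending_cnt > 0) {
		v->avic_pending_cnt--;
		return true;
	}
	return false;
}
```

With a threshold of 4, the vcpu is reported runnable for the next four checks after a GA log entry and then allowed to go idle again, which is roughly the halt_poll_ns-like behaviour described above.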

Let me know what you think.

Thanks,
Suravee