Re: [PART1 RFC 6/9] svm: Add interrupt injection via AVIC

From: Suravee Suthikulpanit
Date: Fri Feb 19 2016 - 06:58:14 EST


Hi Paolo,

On 2/13/16 01:19, Paolo Bonzini wrote:


On 12/02/2016 17:21, Suravee Suthikulpanit wrote:
Hi Paolo,

On 02/12/2016 10:55 PM, Paolo Bonzini wrote:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4244c2b..2def290 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8087,7 +8087,9 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 	if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
 		kvm_x86_ops->check_nested_events(vcpu, false);

-	return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
+	return (kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu) ||
+		(kvm_x86_ops->apicv_intr_pending &&
+		 kvm_x86_ops->apicv_intr_pending(vcpu)));
 }
I think this is not necessary. What you need is to make kvm_lapic's
regs field point to the backing page. Then when the processor writes to
IRR, kvm_apic_has_interrupt (called through kvm_vcpu_has_events) will
see it.

avic_pending_cnt shouldn't be necessary either.
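
A minimal sketch of the suggestion above, assuming a toy model rather than the real KVM structures: once the regs pointer refers to the same page the hardware writes to, an ordinary IRR scan sees interrupts posted by the processor with no extra counter. The names toy_lapic, toy_hw_set_irr and toy_find_highest_irr are hypothetical; only the IRR layout (eight 32-bit registers starting at offset 0x200, 0x10 bytes apart) is architectural. The real kvm_apic_has_interrupt() also takes priority into account, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Architectural APIC layout: IRR is eight 32-bit registers,
 * spaced 0x10 bytes apart, starting at offset 0x200. */
#define APIC_IRR       0x200
#define APIC_PAGE_SIZE 4096

/* Hypothetical, simplified stand-in for struct kvm_lapic. */
struct toy_lapic {
	uint8_t *regs;	/* assumed to point at the AVIC backing page */
};

/* Simulates the hardware posting an interrupt: it sets the vector's
 * IRR bit directly in the backing page, without involving software. */
static void toy_hw_set_irr(uint8_t *backing_page, unsigned int vector)
{
	uint8_t *reg;
	uint32_t word;

	assert(vector < 256);
	reg = backing_page + APIC_IRR + (vector / 32) * 0x10;
	memcpy(&word, reg, sizeof(word));
	word |= UINT32_C(1) << (vector % 32);
	memcpy(reg, &word, sizeof(word));
}

/* Software-side scan through the regs pointer: returns the highest
 * vector pending in IRR, or -1 if none is pending. */
static int toy_find_highest_irr(const struct toy_lapic *apic)
{
	int i;

	for (i = 7; i >= 0; i--) {
		uint32_t word;
		int bit = 31;

		memcpy(&word, apic->regs + APIC_IRR + i * 0x10, sizeof(word));
		if (!word)
			continue;
		while (!(word & (UINT32_C(1) << bit)))
			bit--;
		return i * 32 + bit;
	}
	return -1;
}
```

Because both sides share one page, the software scan needs no notification from the hardware side, which is why a separate pending counter would be redundant in this model.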

Paolo

Actually, during another benchmark (running tar xf linux.tar.gz) on a VM with multiple vCPUs, I also found that performance is quite bad due to a large number of AVIC_INCOMP_IPI vmexits caused by the target vCPU not running. The issue does not occur with one vCPU, when the tar process is pinned to one vCPU with taskset, or when I add the logic above to kvm_arch_vcpu_runnable() to delay halting.


So, the other thing I am using avic_pending_cnt for is part 2 of the
series (enabling AVIC support in the IOMMU), which I am planning to
send out later. However, it might be good to discuss this at this point.

It's better to discuss it later. For now, I would prefer the AVIC
patches to be as clean as possible, and not know about the IOMMU at all.
Also, there are a lot of assumptions about how to use kvm_lapic's regs
field for APIC virtualization---dating back to when Intel only
virtualized the TPR field. Deviating from that would be a recipe for
trouble. :)

Regarding the IOMMU, I'm actually very happy with the way the Intel VT-d
posted interrupts patches worked out, so I would be even more happy if
everything you do fits in the same scheme and reuses the same hooks! :D

When the IOMMU cannot inject interrupts into the guest vcpu because it is
not running (and therefore cannot doorbell the vcpu directly), it logs
the interrupt in the GA log buffer. Then it generates an interrupt to
notify the IOMMU driver that it needs to handle the log entry. Here, the
IOMMU driver will end up notifying SVM to schedule the vcpu in to
process the interrupt.

Here, I have run into an issue where the vcpu often goes idle (i.e. is
scheduled out), which ends up causing the IOMMU to generate a lot of
entries in the GA log. This really hurts device pass-through performance
(e.g. for the XGBE NIC).

So, what I ended up experimenting with is setting avic_pending_cnt to
a larger value (i.e. avic_ga_log_threshold) whenever we process a
GA log entry. The intention is to delay scheduling the vcpu out, in the
expectation that more interrupts might be coming in soon. I also
made this threshold value tunable as a module_param.
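
A rough sketch of the threshold idea described above, as a toy model and not the actual patch. The names toy_vcpu, toy_ga_log_notify and toy_delay_halt are hypothetical, the threshold value of 4 is illustrative, and the assumption that the count drops by one on each halt check is mine, since the decay rule is not spelled out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-vcpu state; only the counter matters for this sketch. */
struct toy_vcpu {
	unsigned int avic_pending_cnt;
};

/* Stand-in for the tunable module parameter; the value is illustrative. */
static unsigned int avic_ga_log_threshold = 4;

/* Called when a GA log entry for this vcpu is processed: refill the
 * counter so the vcpu stays runnable for a while longer. */
static void toy_ga_log_notify(struct toy_vcpu *vcpu)
{
	assert(vcpu);
	vcpu->avic_pending_cnt = avic_ga_log_threshold;
}

/* Called on each halt check. Returns true while the vcpu should be kept
 * runnable instead of being scheduled out.
 * Assumption: every check consumes one unit of the counter. */
static bool toy_delay_halt(struct toy_vcpu *vcpu)
{
	assert(vcpu);
	if (vcpu->avic_pending_cnt == 0)
		return false;
	vcpu->avic_pending_cnt--;
	return true;
}
```

In this model a single GA log notification postpones the schedule-out for avic_ga_log_threshold halt checks, so a burst of device interrupts arriving in that window finds the vcpu still running and does not produce further GA log entries.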

This actually works well in my experiment, where I get about a 5%
speedup in my netperf test with XGBE NIC pass-through.
However, I am not sure if this is an acceptable approach. Actually, I
think it's similar to halt_poll_ns, but specifically for the IOMMU GA
log in this case.

Have you retested now that the halt_poll_ns mechanism is dynamic and
enabled by default? If I read patch 9 right, halt_poll_ns would delay
vcpu_put and IsRunning=0. Hopefully this is enough to avoid this kind
of notification and make the issue moot.

Paolo

I assume that the halt_poll_ns mechanism has already been enabled by default as of commit 93c9247cfd1e608e262274616a28632681abb2d3.

So, I have tried playing with halt_poll_ns and halt_poll_ns_[grow|shrink], but it doesn't seem to help much.

Thanks,
Suravee