On 12/02/2016 17:21, Suravee Suthikulpanit wrote:
Hi Paolo,
On 02/12/2016 10:55 PM, Paolo Bonzini wrote:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4244c2b..2def290 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8087,7 +8087,9 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
         if (is_guest_mode(vcpu) && kvm_x86_ops->check_nested_events)
                 kvm_x86_ops->check_nested_events(vcpu, false);
-        return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
+        return (kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu) ||
+                (kvm_x86_ops->apicv_intr_pending &&
+                 kvm_x86_ops->apicv_intr_pending(vcpu)));
 }
I think this is not necessary. What you need is to make kvm_lapic's
regs field point to the backing page. Then when the processor writes to
IRR, kvm_apic_has_interrupt (called through kvm_vcpu_has_events) will
see it.
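To illustrate the point about the backing page, here is a minimal sketch in plain C, not kernel code. The names toy_lapic, toy_hw_post_vector and toy_find_highest_irr are hypothetical; only the IRR offset (0x200) and the 16-byte register spacing follow the xAPIC register layout. The assumption is that the software view (regs) aliases the same page the hardware writes, so a scan of IRR sees hardware-posted vectors without any extra counter.

```c
#include <stdint.h>
#include <string.h>

/* xAPIC layout: IRR is eight 32-bit registers starting at 0x200,
 * each register 16 bytes apart. */
#define TOY_APIC_IRR        0x200
#define TOY_APIC_REG_STRIDE 0x10

/* Hypothetical stand-in for kvm_lapic: regs points at the backing page. */
struct toy_lapic {
    uint8_t *regs;
};

/* Models the processor setting an IRR bit directly in the backing page. */
static void toy_hw_post_vector(uint8_t *backing_page, int vec)
{
    uint8_t *reg = backing_page + TOY_APIC_IRR + (vec / 32) * TOY_APIC_REG_STRIDE;
    uint32_t word;

    memcpy(&word, reg, sizeof(word));
    word |= 1u << (vec % 32);
    memcpy(reg, &word, sizeof(word));
}

/* Software-side scan: highest pending vector in IRR, or -1 if none. */
static int toy_find_highest_irr(const struct toy_lapic *apic)
{
    for (int i = 7; i >= 0; i--) {
        uint32_t word;
        int bit = 31;

        memcpy(&word, apic->regs + TOY_APIC_IRR + i * TOY_APIC_REG_STRIDE,
               sizeof(word));
        if (!word)
            continue;
        while (!(word & (1u << bit)))
            bit--;
        return i * 32 + bit;
    }
    return -1;
}
```

Because both functions operate on the same page, nothing has to be copied or counted between the "hardware" write and the "software" read.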
avic_pending_cnt shouldn't be necessary either.
Paolo
I assume that the halt_poll_ns mechanism has been enabled by default as of commit 93c9247cfd1e608e262274616a28632681abb2d3.
So, the other thing I am using avic_pending_cnt for is part 2 of the
series (enabling AVIC support in the IOMMU), which I am planning to
send out later. However, it might be good to discuss this at this point.
It's better to discuss it later. For now, I would prefer the AVIC
patches to be as clean as possible, and not know about the IOMMU at all.
Also, there are a lot of assumptions about how to use kvm_lapic's regs
field for APIC virtualization---dating back to when Intel only
virtualized the TPR field. Deviating from that would be a recipe for
trouble. :)
Regarding the IOMMU, I'm actually very happy with the way the Intel VT-d
posted interrupts patches worked out, so I would be even more happy if
everything you do fits in the same scheme and reuses the same hooks! :D
When the IOMMU cannot inject interrupts into the guest vcpu because it
is not running (and therefore cannot doorbell the vcpu directly), it
logs the interrupt in the GA log buffer. It then generates an interrupt
to notify the IOMMU driver that it needs to handle the log entry. Here,
the IOMMU driver ends up notifying SVM to schedule the vcpu in to
process the interrupt.
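The flow described above can be sketched as a small ring buffer model in plain C. This is only an illustration under stated assumptions: toy_vcpu, toy_ga_log, toy_iommu_deliver and toy_ga_log_drain are hypothetical names, the log size is arbitrary, and the real GA log entry format and driver interfaces are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_GA_LOG_ENTRIES 8    /* arbitrary size for the sketch */

struct toy_vcpu {
    bool is_running;            /* models the IsRunning state */
    unsigned int doorbells;     /* interrupts delivered directly */
    unsigned int wakeups;       /* times the driver had to schedule us in */
};

struct toy_ga_log {
    struct toy_vcpu *entries[TOY_GA_LOG_ENTRIES];
    size_t head;                /* next entry to consume */
    size_t tail;                /* next free slot */
};

/*
 * IOMMU side: doorbell a running vcpu, otherwise log the interrupt.
 * Returns true when an entry was logged, i.e. when the IOMMU would
 * raise its notification interrupt towards the driver.
 */
static bool toy_iommu_deliver(struct toy_ga_log *log, struct toy_vcpu *vcpu)
{
    size_t next;

    if (vcpu->is_running) {
        vcpu->doorbells++;
        return false;
    }

    next = (log->tail + 1) % TOY_GA_LOG_ENTRIES;
    if (next == log->head)
        return false;           /* log full: dropped in this toy model */

    log->entries[log->tail] = vcpu;
    log->tail = next;
    return true;
}

/* Driver side: drain the log and wake every vcpu that has an entry. */
static unsigned int toy_ga_log_drain(struct toy_ga_log *log)
{
    unsigned int handled = 0;

    while (log->head != log->tail) {
        struct toy_vcpu *vcpu = log->entries[log->head];

        log->head = (log->head + 1) % TOY_GA_LOG_ENTRIES;
        vcpu->wakeups++;
        vcpu->is_running = true;    /* simplification: wake == running */
        handled++;
    }
    return handled;
}
```

The model makes the cost visible: every interrupt that arrives while is_running is false takes the log-and-notify path instead of the doorbell path.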
Here, I have run into an issue where the vcpu often goes idle (i.e. is
scheduled out), which ends up causing the IOMMU to generate a lot of
entries in the GA log. This really hurts device pass-through performance
(e.g. for XGBE NIC).
So, what I ended up experimenting with is setting avic_pending_cnt to
a larger value (i.e. avic_ga_log_threshold) whenever we process a
GA log entry. The intention is to delay scheduling out the vcpu, in the
expectation that more interrupts might be coming in soon. I also
made this threshold value tunable as a module_param.
This actually works well in my experiment, where I get about a 5%
speedup in my netperf test with XGBE NIC pass-through.
However, I am not sure if this is an acceptable approach. Actually, I
think it's similar to the halt_poll_ns, but specifically for IOMMU GA
log in this case.
Have you retested now that the halt_poll_ns mechanism is dynamic and
enabled by default? If I read patch 9 right, halt_poll_ns would delay
vcpu_put and IsRunning=0. Hopefully this is enough to avoid this kind
of notification and make the issue moot.
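
As a rough illustration of what "dynamic" polling means here, the following plain C sketch adapts a per-vcpu polling window to how long the vcpu actually stayed blocked. It is only loosely modeled on the idea and is not the kernel's implementation: toy_poll_state, toy_poll_update, the 10us base and the growth factor of 2 are all assumptions made for the example.

```c
#include <stdint.h>

#define TOY_POLL_BASE_NS 10000u /* assumed starting window: 10us */
#define TOY_POLL_GROW    2u     /* assumed growth factor */

struct toy_poll_state {
    uint64_t poll_ns;           /* current polling window for this vcpu */
    uint64_t max_ns;            /* upper bound on the window */
};

/*
 * Adjust the polling window after a halt, given how long the vcpu
 * was blocked before the next wakeup arrived.
 */
static void toy_poll_update(struct toy_poll_state *s, uint64_t block_ns)
{
    if (block_ns <= s->poll_ns)
        return;                 /* wakeup fell inside the window: keep it */

    if (block_ns > s->max_ns) {
        s->poll_ns = 0;         /* long sleep: polling would be wasted */
        return;
    }

    /* Wakeup came shortly after polling stopped: widen the window. */
    s->poll_ns = s->poll_ns ? s->poll_ns * TOY_POLL_GROW : TOY_POLL_BASE_NS;
    if (s->poll_ns > s->max_ns)
        s->poll_ns = s->max_ns;
}
```

With a window like this, a vcpu that receives interrupts in quick succession keeps polling instead of being put, so in this model it would stay "running" across short gaps between interrupts.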
Paolo