RE: [PATCH v2 1/2] KVM: x86: Use vector-hashing to deliver lowest-priority interrupts
From: Wu, Feng
Date: Mon Jan 18 2016 - 00:20:07 EST
Hi Radim,
Sorry for the late response; I was tied up with another task for the last
couple of weeks.
> -----Original Message-----
> From: Radim Krčmář [mailto:rkrcmar@xxxxxxxxxx]
> Sent: Thursday, December 24, 2015 1:20 AM
> To: Wu, Feng <feng.wu@xxxxxxxxx>
> Cc: pbonzini@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 1/2] KVM: x86: Use vector-hashing to deliver lowest-priority interrupts
>
> 2015-12-16 09:37+0800, Feng Wu:
> > Use vector-hashing to deliver lowest-priority interrupts. As an
> > example, modern Intel CPUs in server platforms use this method to
> > handle lowest-priority interrupts.
> >
> > Signed-off-by: Feng Wu <feng.wu@xxxxxxxxx>
> > ---
> > diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
> > @@ -78,13 +83,25 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
> > r = 0;
> > r += kvm_apic_set_irq(vcpu, irq, dest_map);
> > } else if (kvm_lapic_enabled(vcpu)) {
> > - if (!lowest)
> > - lowest = vcpu;
> > - else if (kvm_apic_compare_prio(vcpu, lowest) < 0)
> > - lowest = vcpu;
> > + if (!kvm_vector_hashing_enabled()) {
> > + if (!lowest)
> > + lowest = vcpu;
> > + else if (kvm_apic_compare_prio(vcpu, lowest) < 0)
> > + lowest = vcpu;
> > + } else {
> > + __set_bit(vcpu->vcpu_id, dest_vcpu_bitmap);
> > + dest_vcpus++;
> > + }
> > }
> > }
> >
> > + if (dest_vcpus != 0) {
> > + idx = kvm_vector_2_index(irq->vector, dest_vcpus,
> > + dest_vcpu_bitmap, KVM_MAX_VCPUS);
> > +
> > + lowest = kvm_get_vcpu(kvm, idx - 1);
>
> This is going to fail with sparse topologies (e.g. 3 cores per socket).
> vcpu_id = initial APIC ID and kvm_get_vcpu() uses a compressed array
> that has kvm->online_vcpus elements, so we could overflow.
>
> The 'i' in kvm_for_each_vcpu() could be used for the bitmap.
> (kvm_get_vcpu_by_id() instead of kvm_get_vcpu() is slightly worse.)
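
For illustration, a minimal user-space sketch of the sparse-ID problem
described above. It is not KVM code: the topology (2 sockets x 3 cores, with
APIC IDs 0-2 and 4-6 because 2 bits are assumed to be reserved for the core
number) and every helper name here are made up for this example. It only
shows that a bitmap keyed by vcpu_id can yield a position past the end of a
compressed array of online_vcpus entries, while a bitmap keyed by the loop
index cannot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical topology: 2 sockets x 3 cores. Assumes the APIC ID uses
 * 2 bits for the core number, so IDs 3 and 7 are never assigned. */
#define ONLINE_VCPUS 6
static const unsigned int vcpu_ids[ONLINE_VCPUS] = { 0, 1, 2, 4, 5, 6 };

/* Stand-in for a lookup into a compressed array with ONLINE_VCPUS slots:
 * returns 1 if idx is a valid slot, 0 if it would run past the end. */
static int slot_in_range(unsigned int idx)
{
	return idx < ONLINE_VCPUS;
}

/* Bitmap keyed by vcpu_id, as the quoted hunk does with __set_bit(). */
static unsigned long bitmap_by_id(void)
{
	unsigned long bm = 0;

	for (size_t i = 0; i < ONLINE_VCPUS; i++)
		bm |= 1UL << vcpu_ids[i];
	return bm;
}

/* Bitmap keyed by the array position i (the suggested alternative). */
static unsigned long bitmap_by_index(void)
{
	unsigned long bm = 0;

	for (size_t i = 0; i < ONLINE_VCPUS; i++)
		bm |= 1UL << i;
	return bm;
}

/* Position of the highest set bit; bm must be non-zero. */
static unsigned int highest_bit(unsigned long bm)
{
	unsigned int pos = 0;

	assert(bm != 0);
	while (bm >>= 1)
		pos++;
	return pos;
}
```

With these assumed IDs, the highest position in the id-keyed bitmap is 6,
which is not a valid slot in a 6-entry array, whereas every position in the
index-keyed bitmap is.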
>
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > @@ -678,6 +678,22 @@ bool kvm_apic_match_dest(struct kvm_vcpu *vcpu, struct kvm_lapic *source,
> > bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
> > struct kvm_lapic_irq *irq, int *r, unsigned long *dest_map)
> > {
> > @@ -731,17 +747,38 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
> > + if (!kvm_vector_hashing_enabled()) {
> | [...]
> > + } else {
> > + int idx = 0;
> > + unsigned int dest_vcpus = 0;
>
> Now that we don't need to check for present/enabled LAPICs, I think it
> would be better to solve this by assuming that all selected LAPICs are
> enabled, so the n-th target is decided only based on vector and
> destination.
>
> > + for_each_set_bit(i, &bitmap, 16) {
> > + if (!dst[i] && !kvm_lapic_enabled(dst[i]->vcpu)) {
> > + __clear_bit(i, &bitmap);
> > + continue;
> > + }
> > + }
>
> => we could skip this loop.
>
> > +
> > + dest_vcpus = hweight16(bitmap);
> > +
> > + if (dest_vcpus != 0) {
> > + idx = kvm_vector_2_index(irq->vector,
> > + dest_vcpus, &bitmap, 16);
> > +
> > + bitmap = 0;
> > + __set_bit(idx-1, &bitmap);
>
> And set just this bit.
>
> The drawback is that buggy software that included hardware disabled
> APICs to lowest priority destinations could stop working ...

Yes. If the guest has hardware-disabled an APIC and we drop the "!dst[i]"
check above, interrupts could still be delivered to that hardware-disabled
APIC, right?

> Do you think it's too risky?

If you think the first loop has a significant negative impact on performance,
then I think your suggestion above is okay, since it is the software's
responsibility to make sure the LAPIC is hardware-enabled before it receives
the interrupt. However, this will make the vector-hashing lowest-priority
handling slightly different from round-robin, since RR checks "!dst[i]"
before injecting the interrupt. What is your opinion? Thanks a lot!

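
To make the trade-off concrete, here is a small user-space sketch, not the
patch's kvm_vector_2_index(). It assumes the hash is simply "vector modulo
the number of candidate destinations", which then selects the n-th set bit of
a 16-bit destination bitmap. The name pick_dest() and the example values are
hypothetical.

```c
#include <assert.h>

/* Return the position of the (vector % count)-th set bit in the low 16
 * bits of bitmap, counting from bit 0, or -1 if no bit is set.
 * Assumption: the hash is a plain modulo of the vector. */
static int pick_dest(unsigned int vector, unsigned int bitmap)
{
	unsigned int count = 0;
	unsigned int n, i;

	/* Count the candidate destinations. */
	for (i = 0; i < 16; i++)
		if (bitmap & (1u << i))
			count++;

	if (count == 0)
		return -1;

	/* Walk the set bits until the n-th one is reached. */
	n = vector % count;
	for (i = 0; i < 16; i++) {
		if (!(bitmap & (1u << i)))
			continue;
		if (n == 0)
			return (int)i;
		n--;
	}

	return -1;
}
```

For example, with destination bits {0, 1, 3} (0xb), bit 1 standing for a
disabled APIC, and vector 0x31: pick_dest(0x31, 0xb) selects bit 1, the
disabled one, while filtering first with pick_dest(0x31, 0xb & 0x9) selects
bit 3. So skipping the loop changes which target is chosen whenever a
disabled APIC is part of the destination set.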
Thanks,
Feng
>
> > + }
> > }
>
> (This is basically the same as converting the message to a fixed delivery
> to n-th bit beforehand, so it might be reasonable to apply something
> similar to simplify the slow path as well. Mixed flat/cluster/x2APIC
> mode makes me suspect that it won't be reasonable.)