Hello Suravee,
According to AMD's SDM, the target-not-running incomplete
ipi exit is only received if any of the destination cpus had the
not-running bit set in the avic backing page. However, not
before the CPU _already_ set the relevant IRR bit in all these cpus.
In this change, the patch forces KVM to send another interrupt
to the vcpu whether SVM already did that or not. Which means
the vcpu/s, under some conditions, can get an EXTRA interrupt
it never intended to get.
Example:
 1. vcpu B: Is in "not-running" state.
 2. vcpu A: Writes to the ICR to send vector 80 to vcpu B
 3. vcpu A: SVM updates vcpu B IRR with bit 80
 4. vcpu A: SVM exits on incomplete IPI target-not-running exit.
 5. vcpu A: Now stops executing any code @ hypervisor level.
 6. vcpu B: Due to another interrupt (like lapic timer)
ÂÂÂÂ resumes running the guest. While handling interrupts,
ÂÂÂÂ it also handles interrupt vector 80 (as it's in his IRR)
 7. vcpu A: resumes executing the below code and sends
ÂÂÂÂ an _additional_interrupt to vcpu B.
Overall, vcpu B got two interrupts. The second is unwanted and
not documented in the system architecture.
Can you please elaborate more to why the implementation
below conflict with the specifications (which was the code
before this commit) ?
Thanks,
Oren Twaig
> FromÂÂÂ "Suthikulpanit, Suravee" <>
> SubjectÂÂÂ [PATCH] svm: Fix AVIC incomplete IPI emulation
> DateÂÂÂ Tue, 22 Jan 2019 10:25:13 +0000
> share
> From: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
>
> In case of incomplete IPI with invalid interrupt type, the current
> SVM driver does not properly emulate the IPI, and fails to boot
> FreeBSD guests with multiple vcpus when enabling AVIC.
>
> Fix this by update APIC ICR high/low registers, which also
> emulate sending the IPI.
>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
> ---
> arch/x86/kvm/svm.c | 19 ++++---------------
> 1 file changed, 4 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 2aff835a65ed..8a0c9a1f6ac8 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -4504,25 +4504,14 @@ static int avic_incomplete_ipi_interception(struct vcpu_svm *svm)
> ÂÂÂÂ ÂÂÂ kvm_lapic_reg_write(apic, APIC_ICR, icrl);
> ÂÂÂÂ ÂÂÂ break;
> ÂÂÂÂ case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING: {
> -ÂÂÂ ÂÂÂ int i;
> -ÂÂÂ ÂÂÂ struct kvm_vcpu *vcpu;
> -ÂÂÂ ÂÂÂ struct kvm *kvm = svm->vcpu.kvm;
> ÂÂÂÂ ÂÂÂ struct kvm_lapic *apic = svm->vcpu.arch.apic;
>
ÂÂÂÂ ÂÂÂ /*
> -ÂÂÂ ÂÂÂ Â* At this point, we expect that the AVIC HW has already
> -ÂÂÂ ÂÂÂ Â* set the appropriate IRR bits on the valid target
> -ÂÂÂ ÂÂÂ Â* vcpus. So, we just need to kick the appropriate vcpu.
> +ÂÂÂ ÂÂÂ Â* Update ICR high and low, then emulate sending IPI,
> +ÂÂÂ ÂÂÂ Â* which is handled when writing APIC_ICR.
> ÂÂÂÂ ÂÂÂÂ */
> -ÂÂÂ ÂÂÂ kvm_for_each_vcpu(i, vcpu, kvm) {
> -ÂÂÂ ÂÂÂ ÂÂÂ bool m = kvm_apic_match_dest(vcpu, apic,
> -ÂÂÂ ÂÂÂ ÂÂÂ ÂÂÂ ÂÂÂ ÂÂÂ ÂÂÂÂ icrl & KVM_APIC_SHORT_MASK,
> - GET_APIC_DEST_FIELD(icrh),
> - ÂÂÂÂ icrl & KVM_APIC_DEST_MASK);
> -
> -ÂÂÂ ÂÂÂ ÂÂÂ if (m && !avic_vcpu_is_running(vcpu))
> - kvm_vcpu_wake_up(vcpu);
> -ÂÂÂ ÂÂÂ }
> +ÂÂÂ ÂÂÂ kvm_lapic_reg_write(apic, APIC_ICR2, icrh);
> +ÂÂÂ ÂÂÂ kvm_lapic_reg_write(apic, APIC_ICR, icrl);
> ÂÂÂÂ ÂÂÂ break;
> ÂÂÂÂ }
> ÂÂÂÂ case AVIC_IPI_FAILURE_INVALID_TARGET:
> --
> 2.17.1