Hi Suravee,
Please see below..
On 03/11/2019 01:38 PM, Suthikulpanit, Suravee wrote:
Hi Oren,I meant cause ID=1 in IPI Delivery Failure Cause (SDM rev 3.30, sep 2018, Table 15-28):
Sorry for delay response.
On 3/5/19 1:15 AM, Oren Twaig wrote:
Hello Suravee,I believe you are referring to the "isRunning" (IR) bit is in the
According to AMD's SDM, the target-not-running incomplete
ipi exit is only received if any of the destination cpus had the
not-running bit set in the avic backing page.
AVIC physical APIC ID table entry.
"
1: IPI Target Not Running:
IsRunning bit of the target for a
Singlecast/Broadcast/Multicast IPI is not set in
the physical APIC ID table.
"
However, not before the CPU _already_ set the relevant IRR bitNot sure what you meant here.
in all these cpus.
Here is the full snippet from the specifications:
"
5.For every valid destination:
- Atomically set the appropriate IRR bit in each of the destinationsâ
 vAPIC backing page.
- Check the IsRunning status of each destination.
- If the destination IsRunning bit is set, send a doorbell message
 using the host physical core number from the Physical APIC ID table.
6. If any destinations are identified as not currently scheduled on
Âa physical core (i.e., the IsRunning
Âbit for that virtual processor is not set), cause a #VMEXIT.
"
According to the specification above, the HW should first
set the appropriate bit in the IRR (Interrupt Request Register)
_before_ causing VMEXIT of IPI-delivery-not-completed
with ID=1 (Target not running).
Exactly. Only what needed here is *only* to wakeup the vcpu B. Why ? because
In this change, the patch forces KVM to send another interruptThis patch was introduced to fix an issue where the SVM driver tries to
to the vcpu whether SVM already did that or not. Which means
the vcpu/s, under some conditions, can get an EXTRA interrupt
it never intended to get >> Example:
  1. vcpu B: Is in "not-running" state.
  2. vcpu A: Writes to the ICR to send vector 80 to vcpu B
  3. vcpu A: SVM updates vcpu B IRR with bit 80
  4. vcpu A: SVM exits on incomplete IPI target-not-running exit.
  5. vcpu A: Now stops executing any code @ hypervisor level.
  6. vcpu B: Due to another interrupt (like lapic timer)
 Â resumes running the guest. While handling interrupts,
 Â it also handles interrupt vector 80 (as it's in his IRR)
  7. vcpu A: resumes executing the below code and sends
 Â an _additional_interrupt to vcpu B.
Overall, vcpu B got two interrupts. The second is unwanted and
not documented in the system architecture.
Can you please elaborate more to why the implementation
below conflict with the specifications (which was the code
before this commit) ?
handle the step 5 above by scheduling vcpu B into _running_ state to handle
the IPI from vcpu A. However, prior to this patch, vcpu B was never get
scheduled to run unless there are other interrupts (e.g. timer).
the apic of vcpu B _already_ contains the interrupt in the pending
IRR. Than, once vcpu B will run it will process the IRR which contains
the vector placed by the HW and will deliver it.
This should not be the case as Vcpu B should have been running regardlessThe example of vcpu A that stops executing is just to highlight that
of other interrupts. So, I don't think step 6 and 7 above are correct.
the code can't depend on that the kvm code of vcpu A will finish the
ICR "fake" call before vcpu B runs (beacuse of any interrupt) and process
that IRR request placed by the HW.
I still think there a bug here where vcpu B will get two interrupts
The issue was caused by the apic->irr_pending not set to true when trying to
get vcpu B scheduled. This flag is checked in apic_find_highest_irr() before
searching for the highest bit.
To fix the issue, I decided to leverage the existing emulation code for
ICR and ICR2, which in turn calls apic_send_ipi() to deliver interrupt to vpu B.
However, looking a bit more closely, I notice the logic in svm_deliver_avic_intr()
should also have been changed from kvm_vcpu_wake_up() to kvm_vcpu_kick()
since the latter will result in clearing the IRR bit for the IPI vector
when trying to send IPI as part of the following call path.
ÂÂÂ vcpu_enter_guest()
ÂÂÂÂÂ |-- inject_pending_event()
ÂÂÂÂÂÂÂ |-- kvm_cpu_get_interrupt()
ÂÂÂÂÂÂÂÂÂ |--Â kvm_get_apic_interrupt()
ÂÂÂÂÂÂÂÂÂÂÂ |-- apic_clear_irr()
ÂÂÂÂÂÂÂÂÂÂÂ |-- apic_set_isr()
ÂÂÂÂÂÂÂÂÂÂÂ |-- apic_update_ppr() ....
Please see the patch below.
Not sure if this would address the problem you are seeing.
instead of one.
Thanks,
Oren
Thanks,
Suravee
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 24dfa6a93711..d2841c3dbc04 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5219,11 +5256,13 @@ static void svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
ÂÂÂÂÂÂÂÂÂ kvm_lapic_set_irr(vec, vcpu->arch.apic);
ÂÂÂÂÂÂÂÂÂ smp_mb__after_atomic();
-ÂÂÂÂÂÂ if (avic_vcpu_is_running(vcpu))
+ÂÂÂÂÂÂ if (avic_vcpu_is_running(vcpu)) {
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ wrmsrl(SVM_AVIC_DOORBELL,
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kvm_cpu_get_apicid(vcpu->cpu));
-ÂÂÂÂÂÂ else
-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kvm_vcpu_wake_up(vcpu);
+ÂÂÂÂÂÂ } else {
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kvm_make_request(KVM_REQ_EVENT, vcpu);
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kvm_vcpu_kick(vcpu);
+ÂÂÂÂÂÂ }
ÂÂ }
ÂÂ static void svm_ir_list_del(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)