Re: [PATCH v2 12/15] kvm: i8254: Check LAPIC EOI pending when injecting irq on SVM AVIC
From: Alexander Graf
Date: Tue Aug 27 2019 - 05:11:07 EST
On 26.08.19 22:46, Suthikulpanit, Suravee wrote:
Alex,
On 8/19/2019 5:42 AM, Alexander Graf wrote:
On 15.08.19 18:25, Suthikulpanit, Suravee wrote:
ACK notifiers don't work with AMD SVM w/ AVIC when the PIT interrupt
is delivered as edge-triggered fixed interrupt since AMD processors
cannot exit on EOI for these interrupts.
Add code to check LAPIC pending EOI before injecting any pending PIT
interrupt on AMD SVM when AVIC is activated.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
---
  arch/x86/kvm/i8254.c | 31 +++++++++++++++++++++++++------
  1 file changed, 25 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 4a6dc54..31c4a9b 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -34,10 +34,12 @@
  #include <linux/kvm_host.h>
  #include <linux/slab.h>
+#include <asm/virtext.h>
  #include "ioapic.h"
  #include "irq.h"
  #include "i8254.h"
+#include "lapic.h"
  #include "x86.h"
  #ifndef CONFIG_X86_64
@@ -236,6 +238,12 @@ static void destroy_pit_timer(struct kvm_pit *pit)
      kthread_flush_work(&pit->expired);
  }
+static inline void kvm_pit_reset_reinject(struct kvm_pit *pit)
+{
+    atomic_set(&pit->pit_state.pending, 0);
+    atomic_set(&pit->pit_state.irq_ack, 1);
+}
+
  static void pit_do_work(struct kthread_work *work)
  {
      struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
@@ -244,6 +252,23 @@ static void pit_do_work(struct kthread_work *work)
      int i;
      struct kvm_kpit_state *ps = &pit->pit_state;
+    /*
+     * Since, AMD SVM AVIC accelerates write access to APIC EOI
+     * register for edge-trigger interrupts. PIT will not be able
+     * to receive the IRQ ACK notifier and will always be zero.
+     * Therefore, we check if any LAPIC EOI pending for vector 0
+     * and reset irq_ack if no pending.
+     */
+    if (cpu_has_svm(NULL) && kvm->arch.apicv_state == APICV_ACTIVATED) {
+        int eoi = 0;
+
+        kvm_for_each_vcpu(i, vcpu, kvm)
+            if (kvm_apic_pending_eoi(vcpu, 0))
+                eoi++;
+        if (!eoi)
+            kvm_pit_reset_reinject(pit);
In which case would eoi be != 0 when APIC-V is active?
That would be the case when the guest has not yet processed, or is still
processing, the interrupt. Once the guest writes to the APIC EOI register for
the edge-triggered interrupt on vector 0, and the AVIC hardware accelerates
the access by clearing the highest-priority ISR bit, eoi should be zero.
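To make the ISR behaviour described here concrete, the following is a toy userspace model, not KVM code. It assumes the usual LAPIC layout of one in-service bit per vector (256 bits) and an EOI that clears the highest-numbered set bit. All toy_* names are hypothetical, and the real kvm_apic_pending_eoi() helper may look at more state than this model tracks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: 256 in-service bits, one per vector, packed into 8 words. */
struct toy_lapic {
	uint32_t isr[8];
};

/* Mark a vector as in service, as happens when the irq is delivered. */
static void toy_set_in_service(struct toy_lapic *apic, unsigned int vector)
{
	assert(vector < 256);
	apic->isr[vector / 32] |= UINT32_C(1) << (vector % 32);
}

/* True while the guest has not yet written EOI for this vector. */
static bool toy_pending_eoi(const struct toy_lapic *apic, unsigned int vector)
{
	assert(vector < 256);
	return (apic->isr[vector / 32] >> (vector % 32)) & 1u;
}

/* EOI write: clear the highest-numbered (highest-priority) in-service bit. */
static void toy_eoi(struct toy_lapic *apic)
{
	for (int word = 7; word >= 0; word--) {
		uint32_t bits = apic->isr[word];
		int bit = 31;

		if (!bits)
			continue;
		/* Find the top set bit in this word. */
		while (!(bits & (UINT32_C(1) << bit)))
			bit--;
		apic->isr[word] &= ~(UINT32_C(1) << bit);
		return;
	}
}
```

With AVIC the hardware performs the equivalent of toy_eoi() without a VM exit, so the host can only observe the result by reading the state later, which is what the loop in the patch does.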
Thinking about this a bit more, you're basically saying the irq ack
notifier never triggers because we don't see the EOI register write, but
we can determine the state asynchronously.
The irqfd code also uses the ack notifier for level irq reinjection.
Will that break as well?
Wouldn't it make more sense to try to either maintain the ack notifier
API or remove it completely if we can't find a way to make it work with
APIC-V?
So what if we detect that an IRQ vector we're injecting for has an irq
notifier? If it does, we set up / start:
* an hrtimer that polls for EOI on that vector
* a flag so that every vcpu on exit checks for EOI on that vector
* a direct call from pit_do_work to check on it as well
Each of them would go through a single code path that then calls the
ack_notifier.
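A rough userspace sketch of that idea, to show the shape only. Every name here (eoi_watch, the eoi_watch_from_* entry points, the demo_* callbacks) is made up for illustration and none of it is existing KVM API. It also ignores locking: in the kernel the three callers would race, so the armed flag would need an atomic test-and-clear or similar.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical watch on one vector that has an ack notifier attached. */
struct eoi_watch {
	bool armed;                      /* an injected irq still awaits its EOI */
	bool (*eoi_pending)(void *ctx);  /* stand-in for the LAPIC state query */
	void (*ack_notifier)(void *ctx); /* existing consumer, e.g. PIT reinject */
	void *ctx;
};

/* Called when an irq is injected on the watched vector. */
static void eoi_watch_arm(struct eoi_watch *w)
{
	assert(w != NULL);
	w->armed = true;
}

/*
 * The single code path: fire the ack notifier at most once per armed
 * injection, and only after the EOI is no longer pending.
 */
static bool eoi_watch_check(struct eoi_watch *w)
{
	assert(w != NULL);
	if (!w->armed || w->eoi_pending(w->ctx))
		return false;
	w->armed = false;
	w->ack_notifier(w->ctx);
	return true;
}

/* The three triggers from the list above all funnel into the same check. */
static bool eoi_watch_from_timer(struct eoi_watch *w)     { return eoi_watch_check(w); }
static bool eoi_watch_from_vcpu_exit(struct eoi_watch *w) { return eoi_watch_check(w); }
static bool eoi_watch_from_pit_work(struct eoi_watch *w)  { return eoi_watch_check(w); }

/* Demo callbacks standing in for real LAPIC state and a real notifier. */
struct demo_ctx {
	bool in_service;
	int acks;
};

static bool demo_eoi_pending(void *ctx)
{
	return ((struct demo_ctx *)ctx)->in_service;
}

static void demo_ack(void *ctx)
{
	((struct demo_ctx *)ctx)->acks++;
}
```

Whichever trigger notices the completed EOI first delivers the ack, and the other two become no-ops until the next injection re-arms the watch, so existing ack notifier users would not need to know which path fired.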
That way we should be able to just maintain the old API and not get into
unpleasant surprises that only manifest on a tiny fraction of systems, right?
Alternatively, feel free to remove the ack logic altogether and move all
users of it to different mechanisms (check in do_work here, additional
timer in irqfd probably).
Let's try to be as consistent as possible across different host
platforms. Otherwise the test matrix just explodes.
Alex