Re: [PATCH v2 1/2] KVM: VMX: Disable posted interrupts for odd IRQs

From: Alexander Graf
Date: Wed Sep 04 2019 - 11:37:33 EST

On 04.09.19 16:40, Sean Christopherson wrote:
On Wed, Sep 04, 2019 at 03:35:10PM +0200, Alexander Graf wrote:
We can easily route hardware interrupts directly into VM context when
they target the "Fixed" or "LowPriority" delivery modes.

However, on modes such as "SMI" or "Init", we need to go via KVM code
to actually put the vCPU into a different mode of operation, so we can
not post the interrupt

Add code in the VMX PI logic to explicitly refuse to establish posted
mappings for advanced IRQ deliver modes. This reflects the logic in
__apic_accept_irq() which also only ever passes Fixed and LowPriority
interrupts as posted interrupts into the guest.

This fixes a bug I have with code which configures real hardware to
inject virtual SMIs into my guest.

Signed-off-by: Alexander Graf <graf@xxxxxxxxxx>
Reviewed-by: Liran Alon <liran.alon@xxxxxxxxxx>


v1 -> v2:

- Make error message more unique
- Update commit message to point to __apic_accept_irq()
arch/x86/kvm/vmx/vmx.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 570a233e272b..8029fe658c30 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7401,6 +7401,28 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
+ switch (irq.delivery_mode) {
+ case dest_Fixed:
+ case dest_LowestPrio:
+ break;
+ default:
+ /*
+ * For non-trivial interrupt events, we need to go
+ * through the full KVM IRQ code, so refuse to take
+ * any direct PI assignments here.
+ */

IMO, a beefy comment is unnecessary, anyone that is digging through this
code has hopefully read the PI spec or at least understands the basic
concepts. I.e. it should be obvious that PI can't be used for SMI, etc...

+ ret = irq_set_vcpu_affinity(host_irq, NULL);
+ if (ret < 0) {
+ printk(KERN_INFO
+ "non-std IRQ failed to recover, irq: %u\n",
+ host_irq);
+ goto out;
+ }
+ continue;

Using a switch to filter out two types is a bit of overkill. It also

The switch should compile into the same as the if() below, it's just a matter of being more verbose in code.

probably makes sense to perform the deliver_mode checks before calling
kvm_intr_is_single_vcpu(). Why not simply something like this? The
existing comment and error message are even generic enough to keep as is.

Ok, so how about this, even though it goes against Liran's comment on the combined debug print?

If you think it's reasonable despite the broken formatting, I'll be happy to fold the patches and submit as v3.


diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 44a5ce57a905..55f68fb0d791 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1581,6 +1581,12 @@ bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
void kvm_set_msi_irq(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,
struct kvm_lapic_irq *irq);

+static inline bool kvm_irq_is_generic(struct kvm_lapic_irq *irq)
+ return (irq->delivery_mode == dest_Fixed ||
+ irq->delivery_mode == dest_LowestPrio);
static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
if (kvm_x86_ops->vcpu_blocking)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 1f220a85514f..34cc59518cbb 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5260,7 +5260,8 @@ get_pi_vcpu_info(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,

kvm_set_msi_irq(kvm, e, &irq);

- if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) {
+ if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu) ||
+ !kvm_irq_is_generic(&irq)) {
pr_debug("SVM: %s: use legacy intr remap mode for irq %u\n",
__func__, irq.vector);
return -1;
@@ -5314,6 +5315,7 @@ static int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
* 1. When cannot target interrupt to a specific vcpu.
* 2. Unsetting posted interrupt.
* 3. APIC virtialization is disabled for the vcpu.
+ * 4. IRQ has extended delivery mode (SMI, INIT, etc)
if (!get_pi_vcpu_info(kvm, e, &vcpu_info, &svm) && set &&
kvm_vcpu_apicv_active(&svm->vcpu)) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 570a233e272b..69f53809c7bb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7382,10 +7382,14 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
* irqbalance to make the interrupts single-CPU.
* We will support full lowest-priority interrupt later.
+ *
+ * In addition, we can only inject generic interrupts using
+ * the PI mechanism, refuse to route others through it.

kvm_set_msi_irq(kvm, e, &irq);
- if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) {
+ if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu) ||
+ !kvm_irq_is_generic(&irq)) {
* Make sure the IRTE is in remapped mode if
* we don't handle it in posted mode.

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879