Re: [PATCH 2/2] KVM: x86: check_nested_events if there is an injectable NMI

From: Cathy Avery
Date: Thu Apr 23 2020 - 11:37:01 EST


On 4/23/20 10:42 AM, Sean Christopherson wrote:
On Tue, Apr 14, 2020 at 04:11:07PM -0400, Cathy Avery wrote:
With NMI intercept moved to check_nested_events there is a race
condition where vcpu->arch.nmi_pending is set late causing
How is nmi_pending set late? The KVM_{G,S}ET_VCPU_EVENTS paths can't set
it because the current KVM_RUN thread holds the mutex, and the only other
call to process_nmi() is in the request path of vcpu_enter_guest, which has
already executed.

You will have to forgive me as I am new to KVM, and any help would be most appreciated. This is what I noticed when an NMI intercept is processed with the intercept handling implemented in check_nested_events.

When check_nested_events is called from inject_pending_event, check_nested_events needs to have already been called (from kvm_vcpu_running, with vcpu->arch.nmi_pending = 1) to set up the NMI intercept and set svm->nested.exit_required. Otherwise the second check_nested_events call (code below) does not return -EBUSY, and it is that return value which allows us to immediately vmexit.

        /*
         * Call check_nested_events() even if we reinjected a previous event
         * in order for caller to determine if it should require immediate-exit
         * from L2 to L1 due to pending L1 events which require exit
         * from L2 to L1.
         */

        if (is_guest_mode(vcpu) && kvm_x86_ops.check_nested_events) {
                r = kvm_x86_ops.check_nested_events(vcpu);
                if (r != 0)
                        return r;
        }

Unfortunately when kvm_vcpu_running is called vcpu->arch.nmi_pending is not yet set.
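
To make the ordering concrete, here is a toy model of it. This is only a sketch, not KVM code: every toy_* name is hypothetical, and the behavior of toy_check_nested_events (arm exit_required the first time it sees a pending NMI, return -EBUSY on later calls) is an assumption taken from the description above rather than from the SVM implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the vCPU state discussed above. */
struct toy_vcpu {
	bool nmi_queued;           /* NMI accepted by the APIC, not yet processed */
	unsigned int nmi_pending;  /* models vcpu->arch.nmi_pending */
	bool exit_required;        /* models svm->nested.exit_required */
	bool nmi_injected_into_l2; /* NMI went to L2 instead of forcing an exit to L1 */
};

/* Assumed behavior: arm the exit on first sight of a pending NMI, -EBUSY afterwards. */
static int toy_check_nested_events(struct toy_vcpu *v)
{
	if (v->exit_required)
		return -EBUSY;
	if (v->nmi_pending)
		v->exit_required = true;
	return 0;
}

/* Models process_nmi() running from the request path of vcpu_enter_guest(). */
static void toy_process_nmi(struct toy_vcpu *v)
{
	if (v->nmi_queued) {
		v->nmi_queued = false;
		v->nmi_pending++;
	}
}

/* Models inject_pending_event(); extra_check selects the proposed additional call. */
static int toy_inject_pending_event(struct toy_vcpu *v, bool extra_check)
{
	int r = toy_check_nested_events(v); /* the call that already exists */

	if (r != 0)
		return r;
	if (v->nmi_pending) {
		if (extra_check) {
			r = toy_check_nested_events(v);
			if (r != 0)
				return r;
		}
		--v->nmi_pending;
		v->nmi_injected_into_l2 = true;
	}
	return 0;
}

/* One pass: kvm_vcpu_running(), then the request path, then event injection. */
static int toy_run_once(struct toy_vcpu *v, bool extra_check)
{
	toy_check_nested_events(v); /* nmi_pending is still 0 here */
	toy_process_nmi(v);         /* nmi_pending becomes 1 only now */
	return toy_inject_pending_event(v, extra_check);
}

/* Usage: compare the two orderings for a single queued NMI. */
static void toy_demo(void)
{
	struct toy_vcpu without = { .nmi_queued = true };
	struct toy_vcpu with = { .nmi_queued = true };

	assert(toy_run_once(&without, false) == 0);
	assert(without.nmi_injected_into_l2);
	assert(toy_run_once(&with, true) == -EBUSY);
	assert(!with.nmi_injected_into_l2);
}
```

In this model the NMI is delivered to L2 when the extra call is absent, and the extra call is what turns the already armed exit_required into an immediate -EBUSY.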

Here is the trace snippet (with some debug output added) without the second call to check_nested_events.

Thanks,

Cathy

qemu-system-x86-2029  [040]   232.168269: kvm_entry: vcpu 0
qemu-system-x86-2029  [040]   232.168271: kvm_exit: reason EXIT_MSR rip 0x405371 info 1 0
qemu-system-x86-2029  [040]   232.168272: kvm_nested_vmexit: rip 405371 reason EXIT_MSR info1 1 info2 0 int_info 0 int_info_err 0
qemu-system-x86-2029  [040]   232.168273: kvm_apic: apic_write APIC_ICR2 = 0x0
qemu-system-x86-2029  [040]   232.168274: kvm_apic: apic_write APIC_ICR = 0x44400
qemu-system-x86-2029  [040]   232.168275: kvm_apic_ipi: dst 0 vec 0 (NMI|physical|assert|edge|self)
qemu-system-x86-2029  [040]   232.168277: kvm_apic_accept_irq: apicid 0 vec 0 (NMI|edge)
qemu-system-x86-2029  [040]   232.168278: kvm_msr: msr_write 830 = 0x44400
qemu-system-x86-2029  [040]   232.168279: bprint: svm_check_nested_events: svm_check_nested_events reinj = 0, exit_req = 0
qemu-system-x86-2029  [040]   232.168279: bprint: svm_check_nested_events: svm_check_nested_events nmi pending = 0
qemu-system-x86-2029  [040]   232.168279: bputs: vcpu_enter_guest: inject_pending_event 1
qemu-system-x86-2029  [040]   232.168279: bprint: svm_check_nested_events: svm_check_nested_events reinj = 0, exit_req = 0
qemu-system-x86-2029  [040]   232.168279: bprint: svm_check_nested_events: svm_check_nested_events nmi pending = 1
qemu-system-x86-2029  [040]   232.168280: bprint: svm_nmi_allowed: svm_nmi_allowed ret 1
qemu-system-x86-2029  [040]   232.168280: bputs: svm_inject_nmi: svm_inject_nmi
qemu-system-x86-2029  [040]   232.168280: bprint: vcpu_enter_guest: nmi_pending 0
qemu-system-x86-2029  [040]   232.168281: kvm_entry: vcpu 0
qemu-system-x86-2029  [040]   232.168282: kvm_exit: reason EXIT_NMI rip 0x405373 info 1 0
qemu-system-x86-2029  [040]   232.168284: kvm_nested_vmexit_inject: reason EXIT_NMI info1 1 info2 0 int_info 0 int_info_err 0
qemu-system-x86-2029  [040]   232.168285: kvm_entry: vcpu 0
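
For anyone reading the trace, the ICR value 0x44400 written to MSR 0x830 can be decoded by hand. The sketch below uses the standard layout of the low 32 bits of the local APIC ICR; the struct and function names are made up for illustration and are not KVM helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the ICR low-dword fields relevant to the trace. */
struct icr_low_fields {
	unsigned int vector;        /* bits 7:0 */
	unsigned int delivery_mode; /* bits 10:8, 4 means NMI */
	unsigned int logical_dest;  /* bit 11, 0 means physical */
	unsigned int level_assert;  /* bit 14, 1 means assert */
	unsigned int level_trigger; /* bit 15, 0 means edge */
	unsigned int shorthand;     /* bits 19:18, 1 means self */
};

/* Split the low 32 bits of an ICR value into its fields. */
static struct icr_low_fields decode_icr_low(uint32_t icr)
{
	struct icr_low_fields f = {
		.vector        = icr & 0xff,
		.delivery_mode = (icr >> 8) & 0x7,
		.logical_dest  = (icr >> 11) & 0x1,
		.level_assert  = (icr >> 14) & 0x1,
		.level_trigger = (icr >> 15) & 0x1,
		.shorthand     = (icr >> 18) & 0x3,
	};
	return f;
}

/* Usage: the value from the trace decodes to a self NMI, vector 0. */
static void check_trace_value(void)
{
	struct icr_low_fields f = decode_icr_low(0x44400);

	assert(f.vector == 0);
	assert(f.delivery_mode == 4); /* NMI */
	assert(f.logical_dest == 0);  /* physical */
	assert(f.level_assert == 1);  /* assert */
	assert(f.level_trigger == 0); /* edge */
	assert(f.shorthand == 1);     /* self */
}
```

That matches the kvm_apic_ipi line, "dst 0 vec 0 (NMI|physical|assert|edge|self)".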


the execution of check_nested_events to not setup correctly
for nested.exit_required. A second call to check_nested_events
allows the injectable nmi to be detected in time in order to
require immediate exit from L2 to L1.

Signed-off-by: Cathy Avery <cavery@xxxxxxxxxx>
---
arch/x86/kvm/x86.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 027dfd278a97..ecfafcd93536 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7734,10 +7734,17 @@ static int inject_pending_event(struct kvm_vcpu *vcpu)
 		vcpu->arch.smi_pending = false;
 		++vcpu->arch.smi_count;
 		enter_smm(vcpu);
-	} else if (vcpu->arch.nmi_pending && kvm_x86_ops.nmi_allowed(vcpu)) {
-		--vcpu->arch.nmi_pending;
-		vcpu->arch.nmi_injected = true;
-		kvm_x86_ops.set_nmi(vcpu);
+	} else if (vcpu->arch.nmi_pending) {
+		if (is_guest_mode(vcpu) && kvm_x86_ops.check_nested_events) {
+			r = kvm_x86_ops.check_nested_events(vcpu);
+			if (r != 0)
+				return r;
+		}
+		if (kvm_x86_ops.nmi_allowed(vcpu)) {
+			--vcpu->arch.nmi_pending;
+			vcpu->arch.nmi_injected = true;
+			kvm_x86_ops.set_nmi(vcpu);
+		}
 	} else if (kvm_cpu_has_injectable_intr(vcpu)) {
 		/*
 		 * Because interrupts can be injected asynchronously, we are
--
2.20.1