Re: [PATCH v2] KVM: nVMX: Fix attempting to emulate "Acknowledge interrupt on exit" when there is no interrupt which L1 requires to inject to L2

From: Radim Krčmář
Date: Thu Aug 03 2017 - 09:46:46 EST


2017-07-31 19:25-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>
> ------------[ cut here ]------------
> WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
> CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
> RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
> Call Trace:
> vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
> ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
> kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
> ? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
> ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
> kvm_vcpu_ioctl+0x340/0x700 [kvm]
> ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
> ? __fget+0xfc/0x210
> do_vfs_ioctl+0xa4/0x6a0
> ? __fget+0x11d/0x210
> SyS_ioctl+0x79/0x90
> do_syscall_64+0x8f/0x750
> ? trace_hardirqs_on_thunk+0x1a/0x1c
> entry_SYSCALL64_slow_path+0x25/0x25
>
> This can be reproduced by booting the L1 guest with the 'noapic' grub parameter,
> which tells the kernel not to make use of any IOAPICs that may be present
> in the system.
>
> Actually, the external_intr variable in nested_vmx_vmexit() is the req_int_win
> variable passed from vcpu_enter_guest(), which means that L0's userspace
> requests an irq window. I observed that the scenario (!kvm_cpu_has_interrupt(vcpu) &&
> L0's userspace requests an irq window) does occur. In that case there is no
> interrupt which L1 requires to inject to L2, so we should not attempt to emulate
> "Acknowledge interrupt on exit" for the irq window request.
>
> This patch fixes it by not attempting to emulate "Acknowledge interrupt on exit"
> if there is no L1 requirement to inject an interrupt to L2.
>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> ---
> v1 -> v2:
> * update patch description
> * check nested_exit_intr_ack_set() first
>
> arch/x86/kvm/vmx.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 2737343..c5a0ab5 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11118,8 +11118,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>
> vmx_switch_vmcs(vcpu, &vmx->vmcs01);
>
> - if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
> - && nested_exit_intr_ack_set(vcpu)) {

I've added a TODO comment so it's clearer that we should not be here if
there is no interrupt.

> + if (nested_exit_intr_ack_set(vcpu) &&
> + exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
> + kvm_cpu_has_interrupt(vcpu)) {
> int irq = kvm_cpu_get_interrupt(vcpu);
> WARN_ON(irq < 0);
> vmcs12->vm_exit_intr_info = irq |

Changed the indentation to the original alignment.
Please don't use 1 tab -- the condition and body meld, which makes it
harder to read. (2 tabs would be ok too.)

And the subject was way too long, so I changed it to
KVM: nVMX: Fix interrupt window request with "Acknowledge interrupt on exit"

Applied as it results in better behavior, even if it still is incorrect,
thanks.
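
For anyone reading along, here is a minimal standalone sketch of how the
condition changes with this patch. It is not kernel code: struct model_vcpu
and the two helper functions are hypothetical stand-ins, and the two bool
fields only model the results of nested_exit_intr_ack_set(vcpu) and
kvm_cpu_has_interrupt(vcpu). The exit reason value is assumed to match the
VMX basic exit reason for an external interrupt.

```c
#include <stdbool.h>

/* Assumed to match the VMX basic exit reason for an external interrupt. */
#define EXIT_REASON_EXTERNAL_INTERRUPT 1

/* Hypothetical model of the vcpu state that the condition consults. */
struct model_vcpu {
	bool intr_ack_set;	/* models nested_exit_intr_ack_set(vcpu) */
	bool has_interrupt;	/* models kvm_cpu_has_interrupt(vcpu) */
};

/* Condition before the patch: ignores whether an interrupt is pending. */
static bool old_should_ack(const struct model_vcpu *v, unsigned int exit_reason)
{
	return exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
	       v->intr_ack_set;
}

/* Condition after the patch: also requires a pending interrupt. */
static bool new_should_ack(const struct model_vcpu *v, unsigned int exit_reason)
{
	return v->intr_ack_set &&
	       exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
	       v->has_interrupt;
}
```

With intr_ack_set true and has_interrupt false (the irq window request
case from the report), the old condition is true, so the real code would go
on to call kvm_cpu_get_interrupt() and trip the WARN_ON(irq < 0). The new
condition is false, so the acknowledge path is skipped.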