Re: [PATCH v3] KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset

From: Jim Mattson
Date: Mon Mar 06 2017 - 13:19:55 EST


On Mon, Mar 6, 2017 at 4:33 AM, David Hildenbrand <david@xxxxxxxxxx> wrote:
> Am 06.03.2017 um 13:03 schrieb Wanpeng Li:
>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>
>> Reported by syzkaller:
>>
>> WARNING: CPU: 1 PID: 27742 at arch/x86/kvm/vmx.c:11029
>> nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029
>> CPU: 1 PID: 27742 Comm: a.out Not tainted 4.10.0+ #229
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:15 [inline]
>> dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>> panic+0x1fb/0x412 kernel/panic.c:179
>> __warn+0x1c4/0x1e0 kernel/panic.c:540
>> warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
>> nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029
>> vmx_leave_nested arch/x86/kvm/vmx.c:11136 [inline]
>> vmx_set_msr+0x1565/0x1910 arch/x86/kvm/vmx.c:3324
>> kvm_set_msr+0xd4/0x170 arch/x86/kvm/x86.c:1099
>> do_set_msr+0x11e/0x190 arch/x86/kvm/x86.c:1128
>> __msr_io arch/x86/kvm/x86.c:2577 [inline]
>> msr_io+0x24b/0x450 arch/x86/kvm/x86.c:2614
>> kvm_arch_vcpu_ioctl+0x35b/0x46a0 arch/x86/kvm/x86.c:3497
>> kvm_vcpu_ioctl+0x232/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2721
>> vfs_ioctl fs/ioctl.c:43 [inline]
>> do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683
>> SYSC_ioctl fs/ioctl.c:698 [inline]
>> SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
>> entry_SYSCALL_64_fastpath+0x1f/0xc2
>>
>> The syzkaller folks reported a nested_run_pending warning that fires when
>> userspace clears the VMX capability that was previously exposed to L1.
>>
>> The warning gets thrown while doing
>>
>> (*(uint32_t*)0x20aecfe8 = (uint32_t)0x1);
>> (*(uint32_t*)0x20aecfec = (uint32_t)0x0);
>> (*(uint32_t*)0x20aecff0 = (uint32_t)0x3a);
>> (*(uint32_t*)0x20aecff4 = (uint32_t)0x0);
>> (*(uint64_t*)0x20aecff8 = (uint64_t)0x0);
>> r[29] = syscall(__NR_ioctl, r[4], 0x4008ae89ul,
>> 0x20aecfe8ul, 0, 0, 0, 0, 0, 0);
>>
>> i.e. KVM_SET_MSRS ioctl with
>>
>> struct kvm_msrs {
>>     .nmsrs = 1,
>>     .pad = 0,
>>     .entries = {
>>         {.index = MSR_IA32_FEATURE_CONTROL,
>>          .reserved = 0,
>>          .data = 0}
>>     }
>> }
>>
>> The VMLAUNCH/VMRESUME emulation should be stopped since the CPU is going to
>> be reset here. This patch clears nested_run_pending in that case, because
>> nothing should remain pending across a reset.
>>
>> Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
>> Suggested-by: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>> ---
>> v2 -> v3:
>> * move the reset to vmx_leave_nested()
>> v1 -> v2:
>> * cleanup comments format
>>
>> arch/x86/kvm/vmx.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 3b626d6..ab33858 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -11107,8 +11107,10 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>>   */
>>  static void vmx_leave_nested(struct kvm_vcpu *vcpu)
>>  {
>> -	if (is_guest_mode(vcpu))
>> +	if (is_guest_mode(vcpu)) {
>> +		to_vmx(vcpu)->nested.nested_run_pending = 0;
>>  		nested_vmx_vmexit(vcpu, -1, 0, 0);
>> +	}
>>  	free_nested(to_vmx(vcpu));
>>  }
>>
>>
>
> Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
>
> --
> Thanks,
>
> David

This seems reasonable to me, and it should fix the issue exposed by
syzkaller--though I was never able to reproduce it.
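
For anyone following along, the reasoning can be sketched as a small
userspace toy model. This is not the kernel code: struct toy_vcpu and
every toy_* function are hypothetical names made up for illustration, and
the warning counter only stands in for the WARN seen in the trace. It
assumes the warning fires when a forced nested exit finds a VMLAUNCH/VMRESUME
still pending, which is how the commit message describes it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the per-vCPU nested state. */
struct toy_vcpu {
	bool guest_mode;         /* true while the vCPU is running L2 */
	bool nested_run_pending; /* VMLAUNCH/VMRESUME emulation still in flight */
	int warnings;            /* counts would-be WARN hits (illustration only) */
};

/* Models the forced nested exit: cancelling a pending entry counts as a bug. */
static void toy_nested_vmexit(struct toy_vcpu *v)
{
	assert(v->guest_mode); /* an exit only makes sense while in guest mode */
	if (v->nested_run_pending)
		v->warnings++;
	v->guest_mode = false;
}

/* Behaviour before the patch: exit without touching the pending flag. */
static void toy_leave_nested_old(struct toy_vcpu *v)
{
	if (v->guest_mode)
		toy_nested_vmexit(v);
}

/* Behaviour after the patch: drop the pending run first, then exit. */
static void toy_leave_nested_new(struct toy_vcpu *v)
{
	if (v->guest_mode) {
		v->nested_run_pending = false;
		toy_nested_vmexit(v);
	}
}
```

Starting both variants from guest_mode = true and nested_run_pending = true,
the old path bumps the warning counter once and the new path leaves it at
zero. Both leave the toy vCPU out of guest mode.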

Reviewed-by: Jim Mattson <jmattson@xxxxxxxxxx>
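
As a cross-check of the reproducer quoted above, here is a minimal sketch
showing that the five raw stores line up with a one-entry MSR list and that
0x4008ae89 decodes as a write ioctl of type 0xAE, number 0x89, size 8. The
toy_* structs and helpers are hypothetical local mirrors, redeclared so the
sketch builds without kernel headers. The bit positions in toy_iow() assume
the generic Linux ioctl encoding as used on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local mirror of one MSR entry: index, reserved, data. */
struct toy_msr_entry {
	uint32_t index;
	uint32_t reserved;
	uint64_t data;
};

/* Hypothetical local mirror of the list header plus exactly one entry. */
struct toy_msrs {
	uint32_t nmsrs;
	uint32_t pad;
	struct toy_msr_entry entries[1];
};

#define TOY_MSR_IA32_FEATURE_CONTROL 0x3au

/* Assumed generic encoding: dir << 30 | size << 16 | type << 8 | nr. */
static uint32_t toy_iow(uint32_t type, uint32_t nr, uint32_t size)
{
	const uint32_t dir_write = 1u;

	return (dir_write << 30) | (size << 16) | (type << 8) | nr;
}

/* The request number carries only the header size, i.e. the entries offset. */
static uint32_t toy_set_msrs_request(void)
{
	return toy_iow(0xAEu, 0x89u, (uint32_t)offsetof(struct toy_msrs, entries));
}

/* Builds the same payload as the raw stores: FEATURE_CONTROL written as 0. */
static struct toy_msrs toy_feature_control_clear(void)
{
	struct toy_msrs m = {
		.nmsrs = 1,
		.pad = 0,
		.entries = { { .index = TOY_MSR_IA32_FEATURE_CONTROL,
			       .reserved = 0,
			       .data = 0 } },
	};

	return m;
}
```

The field offsets 0, 4, 8, 12 and 16 match the store addresses 0x20aecfe8
through 0x20aecff8 in the reproducer, so the third store (0x3a) lands in
the index field of the first entry.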