Re: kvm: WARNING in nested_vmx_vmexit

From: Radim KrÄmÃÅ
Date: Wed Mar 01 2017 - 13:40:20 EST


2017-03-01 10:44+0100, Dmitry Vyukov:
> On Wed, Mar 1, 2017 at 7:13 AM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>> 2017-02-28 20:15 GMT+08:00 Dmitry Vyukov <dvyukov@xxxxxxxxxx>:
>>> Hello,
>>>
>>> The following program triggers WARNING in nested_vmx_vmexit:
>>> https://gist.githubusercontent.com/dvyukov/16b946d7dc703bb07b9b933f12fb8a6e/raw/dac60506feb8dd9dd22828c486e46ee8a5e30f13/gistfile1.txt
>>>
>>>
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 1 PID: 27742 at arch/x86/kvm/vmx.c:11029
>>> nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029
>>> CPU: 1 PID: 27742 Comm: a.out Not tainted 4.10.0+ #229
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>> Call Trace:
>>> __dump_stack lib/dump_stack.c:15 [inline]
>>> dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>>> panic+0x1fb/0x412 kernel/panic.c:179
>>> __warn+0x1c4/0x1e0 kernel/panic.c:540
>>> warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
>>> nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029
>>> vmx_leave_nested arch/x86/kvm/vmx.c:11136 [inline]
>>> vmx_set_msr+0x1565/0x1910 arch/x86/kvm/vmx.c:3324
>>> kvm_set_msr+0xd4/0x170 arch/x86/kvm/x86.c:1099
>>> do_set_msr+0x11e/0x190 arch/x86/kvm/x86.c:1128
>>> __msr_io arch/x86/kvm/x86.c:2577 [inline]
>>> msr_io+0x24b/0x450 arch/x86/kvm/x86.c:2614
>>> kvm_arch_vcpu_ioctl+0x35b/0x46a0 arch/x86/kvm/x86.c:3497
>>> kvm_vcpu_ioctl+0x232/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2721
>>> vfs_ioctl fs/ioctl.c:43 [inline]
>>> do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683
>>> SYSC_ioctl fs/ioctl.c:698 [inline]
>>> SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
>>> entry_SYSCALL_64_fastpath+0x1f/0xc2
>>> RIP: 0033:0x451229
>>> RSP: 002b:00007fc1e7ebec98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000451229
>>> RDX: 0000000020aecfe8 RSI: 000000004008ae89 RDI: 0000000000000008
>>> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>>> R13: 0000000000000000 R14: 00007fc1e7ebf9c0 R15: 00007fc1e7ebf700
>>>
>>>
>>> On commit e5d56efc97f8240d0b5d66c03949382b6d7e5570
>>
>> I git checkout to this commit for linus tree, however the line number
>> 11029 doesn't match any warning, and there are several warnings in the
>> function nested_vmx_vmexit(), could you point out which one?
>
> Ah sorry, I have some local diff.
> So now I am on 86292b33d4b79ee03e2f43ea0381ef85f077c760 and here is my diff:
> https://gist.githubusercontent.com/dvyukov/6bb88c6e3584c3fd989df01386efcb74/raw/1ae08a3a682ad9467de5eb7fb96d457db514f9d2/gistfile1.txt
>
> The warning is:
>
> /* trying to cancel vmlaunch/vmresume is a bug */
> WARN_ON_ONCE(vmx->nested.nested_run_pending);

I can't reproduce this one, but it is there:

The warning gets thrown while doing

(*(uint32_t*)0x20aecfe8 = (uint32_t)0x1);
(*(uint32_t*)0x20aecfec = (uint32_t)0x0);
(*(uint32_t*)0x20aecff0 = (uint32_t)0x3a);
(*(uint32_t*)0x20aecff4 = (uint32_t)0x0);
(*(uint64_t*)0x20aecff8 = (uint64_t)0x0);
r[29] = syscall(__NR_ioctl, r[4], 0x4008ae89ul,
0x20aecfe8ul, 0, 0, 0, 0, 0, 0);

i.e. KVM_SET_MSR ioctl with

struct kvm_msrs {
.nmsrs = 1,
.pad = 0,
.entries = {
{.index = MSR_IA32_FEATURE_CONTROL,
.reserved = 0,
.data = 0}
}
}

and the relevant setter is

case MSR_IA32_FEATURE_CONTROL:
if (!vmx_feature_control_msr_valid(vcpu, data) ||
(to_vmx(vcpu)->msr_ia32_feature_control &
FEATURE_CONTROL_LOCKED && !msr_info->host_initiated))
return 1;
vmx->msr_ia32_feature_control = data;
if (msr_info->host_initiated && data == 0)
vmx_leave_nested(vcpu);
break;

The condition is wrong as it is no longer only VMX bits in data, so VMX
can get disabled while the data is nonzero. KVM might not need to force
nested VM exit when userspace goes crazy, so the condition would just go
away ... need to check.