Re: [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept

From: Christian Borntraeger
Date: Thu Sep 24 2020 - 02:37:56 EST

On 24.09.20 00:45, Sean Christopherson wrote:
> This series introduces a concept we've discussed a few times in x86 land.
> The crux of the problem is that x86 has a few cases where KVM could
> theoretically encounter a software or hardware bug deep in a call stack
> without any sane way to propagate the error out to userspace.
>
> Another use case would be for scenarios where letting the VM live will
> do more harm than good, e.g. we've been using KVM_BUG_ON for early TDX
> enabling as botching anything related to secure paging all but guarantees
> there will be a flood of WARNs and error messages because lower level PTE
> operations will fail if an upper level operation failed.
>
> The basic idea is to WARN_ONCE if a bug is encountered, kick all vCPUs out
> to userspace, and mark the VM as bugged so that no ioctls() can be issued
> on the VM or its devices/vCPUs.
>
> RFC as I've done nowhere near enough testing to verify that rejecting the
> ioctls(), evicting running vCPUs, etc... works as intended.

I like the idea, especially if we add a common "understanding" of it in QEMU
across all platforms. That would then even allow us to propagate an error.
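
To spell out how I read the basic idea (warn once, kick all vCPUs, reject
further ioctls), here is a rough userspace toy model. It is only a sketch
under my own assumptions, not the code from the patches: every toy_* name
is hypothetical, the warning is tracked per VM rather than per call site,
and -EIO is just an error code picked for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_MAX_VCPUS 4

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct toy_vcpu {
	bool exit_requested;	/* models a pending "leave guest mode" request */
};

struct toy_vm {
	bool bugged;		/* once set, the VM is never usable again */
	bool warned;		/* models the "warn only once" behaviour */
	int nr_vcpus;
	struct toy_vcpu vcpus[TOY_MAX_VCPUS];
};

/* Mark the VM as bugged: warn once, then ask every vCPU to exit. */
void toy_vm_bugged(struct toy_vm *vm, const char *why)
{
	if (!vm->warned) {
		fprintf(stderr, "toy VM bugged: %s\n", why);
		vm->warned = true;
	}
	vm->bugged = true;
	for (int i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].exit_requested = true;
}

/* Check a "should never happen" condition; returns the condition itself. */
bool toy_bug_on(struct toy_vm *vm, bool cond, const char *why)
{
	if (cond)
		toy_vm_bugged(vm, why);
	return cond;
}

/* Every ioctl entry point checks the flag before doing any work. */
int toy_vm_ioctl(struct toy_vm *vm, unsigned int cmd)
{
	if (vm->bugged)
		return -EIO;	/* assumed error code, for illustration only */
	(void)cmd;		/* a real handler would dispatch on cmd here */
	return 0;
}
```

The point of returning the condition from toy_bug_on() is that a caller deep
in a call stack can bail out locally, while the flag takes care of stopping
everything else at the next ioctl.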
>
> Sean Christopherson (3):
>   KVM: Export kvm_make_all_cpus_request() for use in marking VMs as
>     bugged
>   KVM: Add infrastructure and macro to mark VM as bugged
>   KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the
>     VM
>
>  arch/x86/kvm/svm/svm.c   |  2 +-
>  arch/x86/kvm/vmx/vmx.c   | 23 ++++++++++++--------
>  arch/x86/kvm/x86.c       |  4 ++++
>  include/linux/kvm_host.h | 45 ++++++++++++++++++++++++++++++++--------
>  virt/kvm/kvm_main.c      | 11 +++++-----
>  5 files changed, 61 insertions(+), 24 deletions(-)
>