Re: [RFC PATCH v2 3/3] KVM: x86: add vCPU ioctl for HLT exits disable capability
From: Sean Christopherson
Date: Mon Jan 10 2022 - 15:56:29 EST
On Tue, Dec 21, 2021, Kechen Lu wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d5d0d99b584e..d7b4a3e360bb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5072,6 +5072,18 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
> kvm_update_pv_runtime(vcpu);
>
> return 0;
> +
> + case KVM_CAP_X86_DISABLE_EXITS:
> + if (cap->args[0] && (cap->args[0] &
> + ~KVM_X86_DISABLE_VALID_EXITS))
Bad alignment, but there's no need for the !0 in the first place, i.e.
if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
but that's also incomplete as this path only supports toggling HLT, yet allows
all flavors of KVM_X86_DISABLE_VALID_EXITS. Unless there's a good reason to not
allow maniuplating the other exits, the proper fix is to just support everything.
> + return -EINVAL;
> +
> + vcpu->arch.hlt_in_guest = (cap->args[0] &
> + KVM_X86_DISABLE_EXITS_HLT) ? true : false;
Hmm, this behavior diverges from the per-VM ioctl, which doesn't allow re-enabling
a disabled exit. We can't change the per-VM behavior without breaking backwards
compatibility, e.g. if userspace does:
if (allow_mwait)
kvm_vm_disable_exit(KVM_X86_DISABLE_EXITS_MWAIT)
if (allow_hlt)
kvm_vm_disable_exit(KVM_X86_DISABLE_EXITS_HLT)
then changing KVM behavior would result in MWAIT behavior intercepted when previously
it would have been allowed. We have a userspace VMM that operates like this...
Does your use case require toggling intercepts? Or is the configuration static?
If it's static, then the easiest thing would be to follow the per-VM behavior so
that there are no suprises. If toggling is required, then I think the best thing
would be to add a prep patch to add an override flag to the per-VM ioctl, and then
share code between the per-VM and per-vCPU paths for modifying the flags (attached
as patch 0003).
Somewhat related, there's another bug of sorts that I think we can safely fix.
KVM doesn't reject disabling of MWAIT exits when MWAIT isn't allowed in the guest,
and instead ignores the bad input. Not a big deal, but fixing that means KVM
doesn't need to check kvm_can_mwait_in_guest() when processing the args to update
flags. If that breaks someone's userspace, the alternative would be to tweak the
attached patch 0003 to introduce the OVERRIDE, e.g.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f611a49ceba4..3bac756bab79 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5053,6 +5053,8 @@ static int kvm_vcpu_ioctl_device_attr(struct kvm_vcpu *vcpu,
#define kvm_ioctl_disable_exits(a, mask) \
({ \
+ if (!kvm_can_mwait_in_guest()) \
+ (mask) &= KVM_X86_DISABLE_EXITS_MWAIT; \
if ((mask) & KVM_X86_DISABLE_EXITS_OVERRIDE) { \
(a).mwait_in_guest = (mask) & KVM_X86_DISABLE_EXITS_MWAIT; \
(a).hlt_in_guest = (mask) & KVM_X86_DISABLE_EXITS_HLT; \
If toggling is not required, then I still think it makes sense to add a macro to
handle propagating the capability args to the arch flags.