Re: [RFC PATCH] Add Hyperv extended hypercall support in KVM

From: Vipin Sharma
Date: Fri Oct 21 2022 - 17:52:38 EST


On Fri, Oct 21, 2022 at 1:13 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Fri, Oct 21, 2022, Vipin Sharma wrote:
> > Hyperv hypercalls above 0x8000 are called as extended hypercalls as per
> > Hyperv TLFS. Hypercall 0x8001 is used to enquire about available
> > hypercalls by guest VMs.
> >
> > Add support for HvExtCallQueryCapabilities (0x8001) and
> > HvExtCallGetBootZeroedMemory (0x8002) in KVM.
> >
> > A guest VM finds availability of HvExtCallQueryCapabilities (0x8001) by
> > using CPUID.0x40000003.EBX BIT(20). If the bit is set then the guest VM
> > make hypercall HvExtCallQueryCapabilities (0x8001) to know what all
> > extended hypercalls are supported by hypervisor.
> >
> > A userspace VMM can query capability KVM_CAP_HYPERV_EXT_CALL_QUERY to
> > know which extended hypercalls are supported in KVM. After which the
> > userspace will enable capabilities for the guest VM.
> >
> > HvExtCallQueryCapabilities (0x8001) is handled by KVM in kernel,
>
> Does this really need to be handle by KVM? I assume this is a rare operation,
> e.g. done once during guest boot, so performance shouldn't be a concern. To
> avoid breaking existing userspace, KVM can forward HV_EXT_CALL_GET_BOOT_ZEROED_MEMORY
> to userspace if and only if HV_ENABLE_EXTENDED_HYPERCALLS is enabled in CPUID,
> but otherwise KVM can let userspace deal with the "is this enabled" check.

The TLFS mentions four more extended hypercalls but gives no details
about them. From the Linux source code, one of them,
HvExtCallMemoryHeatHint (0x8003), is a rep hypercall. In
drivers/hv/hv_balloon.c:

    status = hv_do_rep_hypercall(HV_EXT_CALL_MEMORY_HEAT_HINT,
                                 nents, 0, hint, NULL);

This makes me a little wary that these hypercalls, or any future
ones, could be issued at high frequency by Windows guests. Also, it
is not clear which calls can or cannot be satisfied by userspace
alone.
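
For reference, a rep hypercall carries its element count and start index
in the hypercall input value itself, so one logical request can turn into
several guest/hypervisor round trips. Below is a minimal sketch of that
encoding, assuming the field layout from the TLFS (rep count in bits
43:32, rep start index in bits 59:48). The helper names and the mask
macro are made up for illustration and are not existing kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions in the 64-bit hypercall input value, per the TLFS. */
#define HV_HYPERCALL_REP_COMP_OFFSET	32		/* rep count, bits 43:32 */
#define HV_HYPERCALL_REP_START_OFFSET	48		/* rep start index, bits 59:48 */
#define HV_HYPERCALL_REP_FIELD_MASK	0xfffULL	/* illustrative: both fields are 12 bits */

#define HV_EXT_CALL_MEMORY_HEAT_HINT	0x8003

/* Hypothetical helper: build the input value for a rep hypercall. */
static uint64_t hv_rep_input(uint16_t code, uint16_t rep_count, uint16_t rep_start)
{
	assert(rep_count <= HV_HYPERCALL_REP_FIELD_MASK);
	assert(rep_start <= HV_HYPERCALL_REP_FIELD_MASK);

	return (uint64_t)code |
	       ((uint64_t)rep_count << HV_HYPERCALL_REP_COMP_OFFSET) |
	       ((uint64_t)rep_start << HV_HYPERCALL_REP_START_OFFSET);
}

/* Hypothetical helper: extract the rep count, as a hypervisor would. */
static uint16_t hv_rep_count(uint64_t input)
{
	return (input >> HV_HYPERCALL_REP_COMP_OFFSET) & HV_HYPERCALL_REP_FIELD_MASK;
}

/* Hypothetical helper: extract the rep start index. */
static uint16_t hv_rep_start(uint64_t input)
{
	return (input >> HV_HYPERCALL_REP_START_OFFSET) & HV_HYPERCALL_REP_FIELD_MASK;
}
```

With nents = 3 and a start index of 0, as in the hv_balloon.c call,
hv_rep_input(HV_EXT_CALL_MEMORY_HEAT_HINT, 3, 0) packs the call code into
the low 16 bits and the count into bits 43:32.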

So I am not sure that exiting to userspace by default for all extended
hypercalls will be future proof. Therefore, I went with the approach of
exiting to userspace selectively, based on the hypercall.
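
Roughly, the selective approach amounts to the dispatch below. This is
only a standalone sketch of the idea, not the code from the RFC: the
enum, the function names, and the per-VM "enabled" bitmap (one bit per
extended hypercall, indexed from 0x8001) are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define HV_EXT_CALL_QUERY_CAPABILITIES		0x8001
#define HV_EXT_CALL_GET_BOOT_ZEROED_MEMORY	0x8002
#define HV_EXT_CALL_MEMORY_HEAT_HINT		0x8003

/* Hypothetical: where a given extended hypercall gets handled. */
enum ext_hcall_route {
	EXT_HCALL_IN_KERNEL,	/* answered by KVM directly */
	EXT_HCALL_TO_USERSPACE,	/* KVM_EXIT_HYPERV_HCALL to the VMM */
	EXT_HCALL_REJECT,	/* would map to HV_STATUS_INVALID_HYPERCALL_CODE */
};

/* Hypothetical: bit in the per-VM bitmap that userspace sets to enable a call. */
static uint64_t ext_hcall_bit(uint16_t code)
{
	return 1ULL << (code - HV_EXT_CALL_QUERY_CAPABILITIES);
}

static enum ext_hcall_route route_ext_hcall(uint16_t code, uint64_t enabled)
{
	switch (code) {
	case HV_EXT_CALL_QUERY_CAPABILITIES:
		/* KVM reports the enabled set itself. */
		return EXT_HCALL_IN_KERNEL;
	case HV_EXT_CALL_GET_BOOT_ZEROED_MEMORY:
		/* Exit to userspace only if the VMM opted in. */
		return (enabled & ext_hcall_bit(code)) ?
			EXT_HCALL_TO_USERSPACE : EXT_HCALL_REJECT;
	default:
		/* Anything KVM does not know about yet is refused. */
		return EXT_HCALL_REJECT;
	}
}
```

The point of the default case is that a call such as
HV_EXT_CALL_MEMORY_HEAT_HINT is refused until someone decides, per call,
whether it belongs in the kernel or in userspace.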

>
> Aha! And if KVM "allows" all theoretically possible extended hypercalls, then
> KVM will never need a capability to announce "support" for a new hypercall, i.e.
> define KVM's ABI to be that KVM punts all possible extended hypercalls to userspace
> if CPUID.0x40000003.EBX BIT(20) is enabled.
>
> > whereas, HvExtCallGetBootZeroedMemory (0x8002) is passed to userspace
> > for further action.
> >
> > Change-Id: Ib3709fadbf11f91be2842c8486bcbe755e09cbea
>
> Drop gerrit's Change-Id when posting publicly.
>
> If KVM punts the support checks to userspace, then the KVM side of things is very
> minimal and future proof (unless Microsoft hoses us). E.g. with code deduplication
> that should be moved to a prep patch:
>
> ---
> arch/x86/kvm/hyperv.c | 43 +++++++++++++++++++++++++++----------------
> 1 file changed, 27 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 0adf4a437e85..f9253249de00 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2138,6 +2138,12 @@ static void kvm_hv_hypercall_read_xmm(struct kvm_hv_hcall *hc)
>  	kvm_fpu_put();
>  }
>
> +/*
> + * The TLFS carves out 64 possible extended hypercalls, numbered sequentially
> + * after the base capabilities extended hypercall.
> + */
> +#define HV_EXT_CALL_MAX (HV_EXT_CALL_QUERY_CAPABILITIES + 64)
> +
>  static bool hv_check_hypercall_access(struct kvm_vcpu_hv *hv_vcpu, u16 code)
>  {
>  	if (!hv_vcpu->enforce_cpuid)
> @@ -2178,6 +2184,10 @@ static bool hv_check_hypercall_access(struct kvm_vcpu_hv *hv_vcpu, u16 code)
>  	case HVCALL_SEND_IPI:
>  		return hv_vcpu->cpuid_cache.enlightenments_eax &
>  			HV_X64_CLUSTER_IPI_RECOMMENDED;
> +	case HV_EXT_CALL_QUERY_CAPABILITIES ... HV_EXT_CALL_MAX:
> +		return hv_vcpu->cpuid_cache.features_ebx &
> +			HV_ENABLE_EXTENDED_HYPERCALLS;
> +		break;
>  	default:
>  		break;
>  	}
> @@ -2270,14 +2280,7 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>  			ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
>  			break;
>  		}
> -		vcpu->run->exit_reason = KVM_EXIT_HYPERV;
> -		vcpu->run->hyperv.type = KVM_EXIT_HYPERV_HCALL;
> -		vcpu->run->hyperv.u.hcall.input = hc.param;
> -		vcpu->run->hyperv.u.hcall.params[0] = hc.ingpa;
> -		vcpu->run->hyperv.u.hcall.params[1] = hc.outgpa;
> -		vcpu->arch.complete_userspace_io =
> -				kvm_hv_hypercall_complete_userspace;
> -		return 0;
> +		goto hypercall_userspace_exit;
>  	case HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST:
>  		if (unlikely(hc.var_cnt)) {
>  			ret = HV_STATUS_INVALID_HYPERCALL_INPUT;
> @@ -2336,15 +2339,14 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>  			ret = HV_STATUS_OPERATION_DENIED;
>  			break;
>  		}
> -		vcpu->run->exit_reason = KVM_EXIT_HYPERV;
> -		vcpu->run->hyperv.type = KVM_EXIT_HYPERV_HCALL;
> -		vcpu->run->hyperv.u.hcall.input = hc.param;
> -		vcpu->run->hyperv.u.hcall.params[0] = hc.ingpa;
> -		vcpu->run->hyperv.u.hcall.params[1] = hc.outgpa;
> -		vcpu->arch.complete_userspace_io =
> -				kvm_hv_hypercall_complete_userspace;
> -		return 0;
> +		goto hypercall_userspace_exit;
>  	}
> +	case HV_EXT_CALL_QUERY_CAPABILITIES ... HV_EXT_CALL_MAX:
> +		if (unlikely(hc.fast)) {
> +			ret = HV_STATUS_INVALID_PARAMETER;
> +			break;
> +		}
> +		goto hypercall_userspace_exit;
>  	default:
>  		ret = HV_STATUS_INVALID_HYPERCALL_CODE;
>  		break;
> @@ -2352,6 +2354,14 @@ int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
>
>  hypercall_complete:
>  	return kvm_hv_hypercall_complete(vcpu, ret);
> +hypercall_userspace_exit:
> +	vcpu->run->exit_reason = KVM_EXIT_HYPERV;
> +	vcpu->run->hyperv.type = KVM_EXIT_HYPERV_HCALL;
> +	vcpu->run->hyperv.u.hcall.input = hc.param;
> +	vcpu->run->hyperv.u.hcall.params[0] = hc.ingpa;
> +	vcpu->run->hyperv.u.hcall.params[1] = hc.outgpa;
> +	vcpu->arch.complete_userspace_io = kvm_hv_hypercall_complete_userspace;
> +	return 0;
>  }
>
>  void kvm_hv_init_vm(struct kvm *kvm)
> @@ -2494,6 +2504,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>
>  			ent->ebx |= HV_POST_MESSAGES;
>  			ent->ebx |= HV_SIGNAL_EVENTS;
> +			ent->ebx |= HV_ENABLE_EXTENDED_HYPERCALLS;
>
>  			ent->edx |= HV_X64_HYPERCALL_XMM_INPUT_AVAILABLE;
>  			ent->edx |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
>
> base-commit: e18d6152ff0f41b7f01f9817372022df04e0d354
> --
>
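
To check my reading of the patch above: the resulting policy seems to be
"forward every code in the extended range to userspace if and only if
CPUID.0x40000003.EBX BIT(20) is set". Below is a standalone sketch of
just that check. It is a simplification: it leaves out the
enforce_cpuid shortcut and the hc.fast rejection from the diff, and the
function name is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HV_EXT_CALL_QUERY_CAPABILITIES	0x8001
/* As in the patch above: the base call plus 64 further codes. */
#define HV_EXT_CALL_MAX			(HV_EXT_CALL_QUERY_CAPABILITIES + 64)
/* CPUID.0x40000003.EBX BIT(20) */
#define HV_ENABLE_EXTENDED_HYPERCALLS	(1u << 20)

/* Hypothetical helper: would this hypercall be punted to userspace? */
static bool ext_hcall_forwarded(uint16_t code, uint32_t features_ebx)
{
	/* The case range in the patch is inclusive on both ends. */
	if (code < HV_EXT_CALL_QUERY_CAPABILITIES || code > HV_EXT_CALL_MAX)
		return false;

	return features_ebx & HV_ENABLE_EXTENDED_HYPERCALLS;
}
```

Since the range is inclusive, codes 0x8001 through 0x8041 are covered,
i.e. the capabilities query plus the 64 calls that follow it.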