Re: [PATCH v14 22/22] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event

From: Sean Christopherson
Date: Wed Apr 24 2024 - 20:10:58 EST


On Sun, Apr 21, 2024, Michael Roth wrote:
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 85099198a10f..6cf186ed8f66 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -7066,6 +7066,7 @@ values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
> struct kvm_user_vmgexit {
> #define KVM_USER_VMGEXIT_PSC_MSR 1
> #define KVM_USER_VMGEXIT_PSC 2
> + #define KVM_USER_VMGEXIT_EXT_GUEST_REQ 3

Assuming we can't get massage this into a vendor agnostic exit, there's gotta be
a better name than EXT_GUEST_REQ, which is completely meaningless to me and probably
most other people that aren't intimately familar with the specs. Request what?

> __u32 type; /* KVM_USER_VMGEXIT_* type */
> union {
> struct {
> @@ -3812,6 +3813,84 @@ static void snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp
> ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
> }
>
> +static int snp_complete_ext_guest_req(struct kvm_vcpu *vcpu)
> +{
> + struct vcpu_svm *svm = to_svm(vcpu);
> + struct vmcb_control_area *control;
> + struct kvm *kvm = vcpu->kvm;
> + sev_ret_code fw_err = 0;
> + int vmm_ret;
> +
> + vmm_ret = vcpu->run->vmgexit.ext_guest_req.ret;
> + if (vmm_ret) {
> + if (vmm_ret == SNP_GUEST_VMM_ERR_INVALID_LEN)
> + vcpu->arch.regs[VCPU_REGS_RBX] =
> + vcpu->run->vmgexit.ext_guest_req.data_npages;
> + goto abort_request;
> + }
> +
> + control = &svm->vmcb->control;
> +
> + /*
> + * To avoid the message sequence number getting out of sync between the
> + * actual value seen by firmware verses the value expected by the guest,
> + * make sure attestations can't get paused on the write-side at this
> + * point by holding the lock for the entire duration of the firmware
> + * request so that there is no situation where SNP_GUEST_VMM_ERR_BUSY
> + * would need to be returned after firmware sees the request.
> + */
> + mutex_lock(&snp_pause_attestation_lock);

Why is this in KVM? IIUC, KVM only needs to get involved to translate GFNs to
PFNs, the rest can go in the sev-dev driver, no? The whole split is weird,
seemingly due to KVM "needing" to take this lock. E.g. why is core kernel code
in arch/x86/virt/svm/sev.c at all dealing with attestation goo, when pretty much
all of the actual usage is (or can be) in sev-dev.c

> +
> + if (__snp_transaction_is_stale(svm->snp_transaction_id))
> + vmm_ret = SNP_GUEST_VMM_ERR_BUSY;
> + else if (!__snp_handle_guest_req(kvm, control->exit_info_1,
> + control->exit_info_2, &fw_err))
> + vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
> +
> + mutex_unlock(&snp_pause_attestation_lock);
> +
> +abort_request:
> + ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, SNP_GUEST_ERR(vmm_ret, fw_err));
> +
> + return 1; /* resume guest */
> +}
> +
> +static int snp_begin_ext_guest_req(struct kvm_vcpu *vcpu)
> +{
> + int vmm_ret = SNP_GUEST_VMM_ERR_GENERIC;
> + struct vcpu_svm *svm = to_svm(vcpu);
> + unsigned long data_npages;
> + sev_ret_code fw_err;
> + gpa_t data_gpa;
> +
> + if (!sev_snp_guest(vcpu->kvm))
> + goto abort_request;
> +
> + data_gpa = vcpu->arch.regs[VCPU_REGS_RAX];
> + data_npages = vcpu->arch.regs[VCPU_REGS_RBX];
> +
> + if (!IS_ALIGNED(data_gpa, PAGE_SIZE))
> + goto abort_request;
> +
> + svm->snp_transaction_id = snp_transaction_get_id();
> + if (snp_transaction_is_stale(svm->snp_transaction_id)) {

Why bother? I assume this is rare, so why not handle it on the backend, i.e.
after userspace does its thing? Then KVM doesn't even have to care about
checking for stale IDs, I think?

None of this makes much sense to my eyes, e.g. AFAICT, the only thing that can
pause attestation is userspace, yet the kernel is responsible for tracking whether
or not a transaction is fresh?