Re: [PATCH] KVM: SVM: Fix SNP AP destroy race with VMRUN
From: Tom Lendacky
Date: Mon Mar 17 2025 - 13:28:34 EST
On 3/17/25 12:20, Tom Lendacky wrote:
> An AP destroy request for a target vCPU is typically followed by an
> RMPADJUST to remove the VMSA attribute from the page currently being
> used as the VMSA for the target vCPU. This can result in a vCPU that
> is about to VMRUN to exit with #VMEXIT_INVALID.
>
> This usually does not happen as APs are typically sitting in HLT when
> being destroyed and therefore the vCPU thread is not running at the time.
> However, if HLT is allowed inside the VM, then the vCPU could be about to
> VMRUN when the VMSA attribute is removed from the VMSA page, resulting in
> a #VMEXIT_INVALID when the vCPU actually issues the VMRUN and causing the
> guest to crash. An RMPADJUST against an in-use (already running) VMSA
> results in a #NPF for the vCPU issuing the RMPADJUST, so the VMSA
> attribute cannot be changed until the VMRUN for target vCPU exits. The
> Qemu command line option '-overcommit cpu-pm=on' is an example of allowing
> HLT inside the guest.
>
> Use kvm_test_request() to ensure that the target vCPU sees the AP destroy
> request before returning to the initiating vCPU.
>
> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
> Signed-off-by: Tom Lendacky <thomas.lendacky@xxxxxxx>
Sean,
If you're ok with this approach for the fix, this patch may need to be
adjusted given your series around AP creation fixes, unless you want to
put this as an early patch in your series. Let me know what you'd like
to do.
Thanks,
Tom
> ---
> arch/x86/kvm/svm/sev.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 0d898d6b697f..a040f29bb07b 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4071,6 +4071,16 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> if (kick) {
> kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
> kvm_vcpu_kick(target_vcpu);
> +
> + if (request == SVM_VMGEXIT_AP_DESTROY) {
> + /*
> + * A destroy is likely to be followed by an RMPADJUST
> + * that will remove the VMSA flag, so be sure the vCPU
> + * got the request in case it is on the way to a VMRUN.
> + */
> + while (kvm_test_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu))
> + cond_resched();
> + }
> }
>
> mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);
>
> base-commit: f8d892c137f7448d7b49f5e3ad7aa7b5a48a64ed