Re: [PATCH 10/12] KVM: SEV: Prohibit migration of a VM that has mirrors

From: Sean Christopherson
Date: Mon Nov 29 2021 - 18:12:55 EST


On Mon, Nov 22, 2021, Paolo Bonzini wrote:
> VMs that mirror an encryption context rely on the owner to keep the
> ASID allocated. Performing a KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
> would cause a dangling ASID:
>
> 1. copy context from A to B (gets ref to A)
> 2. move context from A to L (moves ASID from A to L)
> 3. close L (releases ASID from L, B still references it)
>
> The right way to do the handoff instead is to create a fresh mirror VM
> on the destination first:
>
> 1. copy context from A to B (gets ref to A)
> [later] 2. close B (releases ref to A)
> 3. move context from A to L (moves ASID from A to L)
> 4. copy context from L to M
>
> So, catch the situation by adding a count of how many VMs are
> mirroring this one's encryption context.
>
> Fixes: 0b020f5af092 ("KVM: SEV: Add support for SEV-ES intra host migration")
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> arch/x86/kvm/svm/sev.c | 22 ++++++++++-
> arch/x86/kvm/svm/svm.h | 1 +
> .../selftests/kvm/x86_64/sev_migrate_tests.c | 37 +++++++++++++++++++
> 3 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 025d9731b66c..89a716290fac 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1696,6 +1696,16 @@ int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
> }
>
> src_sev = &to_kvm_svm(source_kvm)->sev_info;
> +
> + /*
> + * VMs mirroring src's encryption context rely on it to keep the
> + * ASID allocated, but below we are clearing src_sev->asid.

Prefer something to explan why this is disallowed instead of simply saying the
src's ASID is cleared/released, e.g.

/*
* Disallow migrating a VM with active mirrors, as the mirrors rely on
* the VM to keep the ASID allocated. Transferring all mirrors to dst
* would require locking all mirrors, and there's no known use case for
* intra-host migration of a VM with mirrors.
*/

> + */
> + if (src_sev->num_mirrored_vms) {
> + ret = -EBUSY;
> + goto out_unlock;
> + }
> +
> dst_sev->misc_cg = get_current_misc_cg();
> cg_cleanup_sev = dst_sev;
> if (dst_sev->misc_cg != src_sev->misc_cg) {
> @@ -1987,6 +1997,7 @@ int svm_vm_copy_asid_from(struct kvm *kvm, unsigned int source_fd)
> */
> source_sev = &to_kvm_svm(source_kvm)->sev_info;
> kvm_get_kvm(source_kvm);
> + source_sev->num_mirrored_vms++;
>
> /* Set enc_context_owner and copy its encryption context over */
> mirror_sev = &to_kvm_svm(kvm)->sev_info;
> @@ -2019,12 +2030,21 @@ void sev_vm_destroy(struct kvm *kvm)
> struct list_head *head = &sev->regions_list;
> struct list_head *pos, *q;
>
> + WARN_ON(sev->num_mirrored_vms);
> +
> if (!sev_guest(kvm))
> return;
>
> /* If this is a mirror_kvm release the enc_context_owner and skip sev cleanup */
> if (is_mirroring_enc_context(kvm)) {
> - kvm_put_kvm(sev->enc_context_owner);
> + struct kvm *owner_kvm = sev->enc_context_owner;
> + struct kvm_sev_info *owner_sev = &to_kvm_svm(owner_kvm)->sev_info;
> +
> + mutex_lock(&owner_kvm->lock);
> + if (!WARN_ON(!owner_sev->num_mirrored_vms))

Why not make num_mirrored_vms an atomic_t so that the destruction path doesn't
need to take owner_kvm->lock? The asymmetry is a bit odd, but this feels worse
in its own way.

And since this is effectively a refcount, I almost wonder if it would make sense
to do:

if (!refcount_inc_not_zero(&source_sev->num_mirrored_vms)) {
refcount_set(&source_sev->num_mirrored_vms, 1);
kvm_get_kvm(source_kvm);
}

and then

if (refcount_dec_and_test(&source_sev->num_mirrored_vms))
kvm_put_kvm(owner_kvm);

to "officially" refcount something.


> + owner_sev->num_mirrored_vms--;
> + mutex_unlock(&owner_kvm->lock);
> + kvm_put_kvm(owner_kvm);
> return;
> }
>
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 5faad3dc10e2..1c7306c370fa 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -79,6 +79,7 @@ struct kvm_sev_info {
> struct list_head regions_list; /* List of registered regions */
> u64 ap_jump_table; /* SEV-ES AP Jump Table address */
> struct kvm *enc_context_owner; /* Owner of copied encryption context */
> + unsigned long num_mirrored_vms; /* Number of VMs sharing this ASID */

Unsigned long is odd. It's guaranteed to be u64 since SEV is 64-bit only. If
it really is possible to overflow a refcount_t/int, than u64 or atomic64_t seems
more appropriate.

> struct misc_cg *misc_cg; /* For misc cgroup accounting */
> atomic_t migration_in_progress;
> };