Re: [PATCH 1/3 V7] KVM, SEV: Add support for SEV intra host migration
From: Marc Orr
Date: Thu Sep 09 2021 - 23:41:34 EST
On Thu, Sep 9, 2021 at 6:40 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Sep 09, 2021, Marc Orr wrote:
> > > > +int svm_vm_migrate_from(struct kvm *kvm, unsigned int source_fd)
> > > > +{
> > > > +	struct kvm_sev_info *dst_sev = &to_kvm_svm(kvm)->sev_info;
> > > > +	struct file *source_kvm_file;
> > > > +	struct kvm *source_kvm;
> > > > +	int ret;
> > > > +
> > > > +	ret = svm_sev_lock_for_migration(kvm);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	if (!sev_guest(kvm) || sev_es_guest(kvm)) {
> > > > +		ret = -EINVAL;
> > > > +		pr_warn_ratelimited("VM must be SEV enabled to migrate to.\n");
> > >
> > > Linux generally doesn't log user errors to dmesg. They can be helpful during
> > > development, but aren't actionable and thus are of limited use in production.
> >
> > Ha. I had suggested adding the logs when I reviewed these patches
> > (maybe before Peter posted them publicly). My rationale is that if I'm
> > looking at a crash in production, and all I have is a stack trace and
> > the error code, then I can narrow the failure down to this function,
> > but once the function starts returning the same error code in multiple
> > places now it's non-trivial for me to deduce exactly which condition
> > caused the crash. Having these logs makes it trivial. However, if this
> > is not the preferred Linux style then so be it.
>
> I don't necessarily disagree, but none of these error conditions should so much
> as sniff production. E.g. if userspace invokes this on a !KVM fd or on a non-SEV
> source, or before guest_state_protected=true, then userspace has bigger problems.
> Ditto if the dest isn't an actual KVM VM or doesn't meet whatever SEV-enabled/disabled
> criteria we end up with.
>
> The mismatch in online_vcpus is the only one where I could reasonably see a bug
> escaping to production, e.g. due to an orchestration layer mixup.
>
> For all of these conditions, userspace _must_ be aware of the conditions for success,
> and except for guest_state_protected=true, userspace has access to what state it
> sent into KVM, e.g. it shouldn't be difficult for userspace to dump the relevant bits
> from the src and dst without any help from the kernel.
>
> If userspace really needs kernel help to differentiate what's up, I'd rather use
> more unique errors for online_vcpus and guest_state_protected, e.g. -E2BIG isn't
> too big of a stretch for the online_vcpus mismatch.
SGTM, thanks.
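For illustration, here is a rough userspace sketch of what "more unique errors" could look like. This is not the patch: struct mock_vm and mock_migrate_check() are hypothetical stand-ins, and apart from -E2BIG for the online_vcpus mismatch (suggested above), the errno choices (-EBADF, -EBUSY) are my assumptions. The destination's SEV-enabled/disabled criteria are left out since they have not been settled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-VM state the checks would look at. */
struct mock_vm {
	bool is_kvm_vm;             /* fd actually refers to a KVM VM */
	bool sev_enabled;           /* SEV is active for this VM */
	bool sev_es_enabled;        /* SEV-ES is active for this VM */
	bool guest_state_protected; /* SEV-ES guest state is protected */
	int online_vcpus;
};

/*
 * Return 0 if migration from src to dst may proceed, otherwise a
 * negative errno that is distinct per failure condition, so userspace
 * can tell the failures apart without any dmesg output.
 */
int mock_migrate_check(const struct mock_vm *dst, const struct mock_vm *src)
{
	/* Assumption: -EBADF for an fd that is not a KVM VM. */
	if (!src->is_kvm_vm || !dst->is_kvm_vm)
		return -EBADF;

	/* Generic bad-argument case: the source is not an SEV guest. */
	if (!src->sev_enabled)
		return -EINVAL;

	/* Assumption: -EBUSY when an SEV-ES source is not yet protected. */
	if (src->sev_es_enabled && !src->guest_state_protected)
		return -EBUSY;

	/* From the thread: -E2BIG for the online_vcpus mismatch. */
	if (dst->online_vcpus != src->online_vcpus)
		return -E2BIG;

	return 0;
}
```

With something along these lines, the orchestration layer could branch on the returned errno instead of relying on kernel logs.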