Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

From: Steve Rutherford
Date: Thu Apr 01 2021 - 21:45:45 EST


On Fri, Mar 19, 2021 at 11:00 AM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
>
> On Thu, Mar 11, 2021 at 12:48:07PM -0800, Steve Rutherford wrote:
> > On Thu, Mar 11, 2021 at 10:15 AM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
> > >
> > > On Wed, Mar 03, 2021 at 06:54:41PM +0000, Will Deacon wrote:
> > > > [+Marc]
> > > >
> > > > On Tue, Mar 02, 2021 at 02:55:43PM +0000, Ashish Kalra wrote:
> > > > > On Fri, Feb 26, 2021 at 09:44:41AM -0800, Sean Christopherson wrote:
> > > > > > On Fri, Feb 26, 2021, Ashish Kalra wrote:
> > > > > > > On Thu, Feb 25, 2021 at 02:59:27PM -0800, Steve Rutherford wrote:
> > > > > > > > On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
> > > > > > > > Thanks for grabbing the data!
> > > > > > > >
> > > > > > > > I am fine with both paths. Sean has stated an explicit desire for
> > > > > > > > hypercall exiting, so I think that would be the current consensus.
> > > > > >
> > > > > > Yep, though it'd be good to get Paolo's input, too.
> > > > > >
> > > > > > > > If we want to do hypercall exiting, this should be in a follow-up
> > > > > > > > series where we implement something more generic, e.g. a hypercall
> > > > > > > > exiting bitmap or hypercall exit list. If we are taking the hypercall
> > > > > > > > exit route, we can drop the kvm side of the hypercall.
> > > > > >
> > > > > > I don't think this is a good candidate for arbitrary hypercall interception. Or
> > > > > > rather, I think hypercall interception should be an orthogonal implementation.
> > > > > >
> > > > > > The guest, including guest firmware, needs to be aware that the hypercall is
> > > > > > supported, and the ABI needs to be well-defined. Relying on userspace VMMs to
> > > > > > implement a common ABI is an unnecessary risk.
> > > > > >
> > > > > > We could make KVM's default behavior be a nop, i.e. have KVM enforce the ABI but
> > > > > > require further VMM intervention. But, I just don't see the point, it would
> > > > > > save only a few lines of code. It would also limit what KVM could do in the
> > > > > > future, e.g. if KVM wanted to do its own bookkeeping _and_ exit to userspace,
> > > > > > then mandatory interception would essentially make it impossible for KVM to do
> > > > > > bookkeeping while still honoring the interception request.
> > > > > >
> > > > > > However, I do think it would make sense to have the userspace exit be a generic
> > > > > > exit type. But hey, we already have the necessary ABI defined for that! It's
> > > > > > just not used anywhere.
> > > > > >
> > > > > > /* KVM_EXIT_HYPERCALL */
> > > > > > struct {
> > > > > > __u64 nr;
> > > > > > __u64 args[6];
> > > > > > __u64 ret;
> > > > > > __u32 longmode;
> > > > > > __u32 pad;
> > > > > > } hypercall;
> > > > > >
> > > > > >
> > > > > > > > Userspace could also handle the MSR using MSR filters (would need to
> > > > > > > > confirm that). Then userspace could also be in control of the cpuid bit.
> > > > > >
> > > > > > An MSR is not a great fit; it's x86 specific and limited to 64 bits of data.
> > > > > > The data limitation could be fudged by shoving data into non-standard GPRs, but
> > > > > > that will result in truly heinous guest code, and extensibility issues.
> > > > > >
>
> We may also need to pass-through the MSR to userspace, as it is a part of this
> complete host (userspace/kernel), OVMF and guest kernel negotiation of
> the SEV live migration feature.
>
> Host (userspace/kernel) advertises it's support for SEV live migration
> feature via the CPUID bits, which is queried by OVMF and which in turn
> adds a new UEFI runtime variable to indicate support for SEV live
> migration, which is later queried during guest kernel boot and
> accordingly the guest does a wrmrsl() to custom MSR to complete SEV
> live migration negotiation and enable it.
>
> Now, the GET_SHARED_REGION_LIST ioctl returns error, until this MSR write
> enables SEV live migration, hence, preventing userspace to start live
> migration before the feature support has been negotiated and enabled on
> all the three components - host, guest OVMF and kernel.
>
> But, now with this ioctl not existing anymore, we will need to
> pass-through the MSR to userspace too, for it to only initiate live
> migration once the feature negotiation has been completed.
Hey Ashish,

I can't tell if you were waiting for feedback on this before posting
the follow-up patch series.

Here are a few options:
1) Add the MSR explicitly to the list of custom kvm MSRs, but don't
have it hooked up anywhere. The expectation would be for the VMM to
use msr intercepts to handle the reads and writes. If that seems
weird, have svm_set_msr (or whatever) explicitly ignore it.
2) Add a getter and setter for the MSR. Only allow guests to use it if
they are sev_guests with the requisite CPUID bit set.

I think I prefer the former, and it should work fine from my
understanding of the msr intercepts implementation. I'm also open to
other ideas. You could also have the MSR write trigger a KVM_EXIT of
the same type as the hypercall, but have it just say "the msr value
changed to XYZ", but that design sounds awkward.

Thanks,
Steve