Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

From: Sean Christopherson
Date: Fri Feb 26 2021 - 12:45:36 EST


+Will and Quentin (arm64)

Moving the non-KVM x86 folks to bcc, I don't think they care about KVM details
at this point.

On Fri, Feb 26, 2021, Ashish Kalra wrote:
> On Thu, Feb 25, 2021 at 02:59:27PM -0800, Steve Rutherford wrote:
> > On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
> > Thanks for grabbing the data!
> >
> > I am fine with both paths. Sean has stated an explicit desire for
> > hypercall exiting, so I think that would be the current consensus.

Yep, though it'd be good to get Paolo's input, too.

> > If we want to do hypercall exiting, this should be in a follow-up
> > series where we implement something more generic, e.g. a hypercall
> > exiting bitmap or hypercall exit list. If we are taking the hypercall
> > exit route, we can drop the kvm side of the hypercall.

I don't think this is a good candidate for arbitrary hypercall interception. Or
rather, I think hypercall interception should be an orthogonal implementation.

The guest, including guest firmware, needs to be aware that the hypercall is
supported, and the ABI needs to be well-defined. Relying on userspace VMMs to
implement a common ABI is an unnecessary risk.

We could make KVM's default behavior be a nop, i.e. have KVM enforce the ABI but
require further VMM intervention. But I just don't see the point; it would
save only a few lines of code. It would also limit what KVM could do in the
future, e.g. if KVM wanted to do its own bookkeeping _and_ exit to userspace,
then mandatory interception would essentially make it impossible for KVM to do
bookkeeping while still honoring the interception request.

However, I do think it would make sense to have the userspace exit be a generic
exit type. But hey, we already have the necessary ABI defined for that! It's
just not used anywhere.

	/* KVM_EXIT_HYPERCALL */
	struct {
		__u64 nr;
		__u64 args[6];
		__u64 ret;
		__u32 longmode;
		__u32 pad;
	} hypercall;
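
To make the idea concrete, here is a rough sketch of how a userspace VMM might
consume such an exit. Everything in it is an assumption for illustration only:
the local struct hypercall_exit merely mirrors the layout quoted above, and the
hypercall number, argument order (gpa, npages, encrypted), error values, and
the shared_pages counter are all hypothetical, not a proposed ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the KVM_EXIT_HYPERCALL payload quoted above. */
struct hypercall_exit {
	uint64_t nr;
	uint64_t args[6];
	uint64_t ret;
	uint32_t longmode;
	uint32_t pad;
};

/* Hypothetical values, chosen only for this sketch. */
#define EXAMPLE_HC_PAGE_ENC_STATUS	0x1000ULL
#define EXAMPLE_PAGE_SIZE		4096ULL
#define EXAMPLE_ERR_INVAL		((uint64_t)-22)
#define EXAMPLE_ERR_NOSYS		((uint64_t)-1000)

/* Hypothetical VMM bookkeeping: number of pages currently shared. */
static uint64_t shared_pages;

/*
 * Handle one hypercall exit. Returns 0 if the hypercall number was
 * recognized (hc->ret then holds the guest-visible result), or -1 if
 * it was not (hc->ret is set to the example "not supported" error).
 */
static int handle_hypercall_exit(struct hypercall_exit *hc)
{
	assert(hc != NULL);

	if (hc->nr != EXAMPLE_HC_PAGE_ENC_STATUS) {
		hc->ret = EXAMPLE_ERR_NOSYS;
		return -1;
	}

	uint64_t gpa = hc->args[0];
	uint64_t npages = hc->args[1];
	uint64_t encrypted = hc->args[2];

	/* Reject unaligned or empty ranges. */
	if (gpa % EXAMPLE_PAGE_SIZE != 0 || npages == 0) {
		hc->ret = EXAMPLE_ERR_INVAL;
		return 0;
	}

	if (!encrypted)
		shared_pages += npages;	/* guest shared the range */
	else				/* guest made the range private again */
		shared_pages -= (npages < shared_pages) ? npages : shared_pages;

	hc->ret = 0;
	return 0;
}
```

The point of the sketch is only that nr/args/ret are enough to carry a
"change page encryption status" request to userspace and a result back to the
guest, without inventing a new exit type.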


> > Userspace could also handle the MSR using MSR filters (would need to
> > confirm that). Then userspace could also be in control of the cpuid bit.

An MSR is not a great fit; it's x86 specific and limited to 64 bits of data.
The data limitation could be fudged by shoving data into non-standard GPRs, but
that will result in truly heinous guest code, and extensibility issues.

The data limitation is a moot point, because the x86-only thing is a deal
breaker. arm64's pKVM work has a near-identical use case for a guest to share
memory with a host. I can't think of a clever way to avoid having to support
TDX's and SNP's hypervisor-agnostic variants, but we can at least not have
multiple KVM variants.

> > Essentially, I think you could drop most of the host kernel work if
> > there were generic support for hypercall exiting. Then userspace would
> > be responsible for all of that. Thoughts on this?

> So if I understand it correctly, I will be submitting v11 of this patch-set
> with in-kernel support for page encryption status hypercalls and shared
> pages list and the userspace control of SEV live migration feature
> support and fixes for MSR handling.

At this point, I'd say hold off on putting more effort into an implementation
until we have consensus.

> In subsequent follow-up patches we will add generic support for hypercall
> exiting and then drop kvm side of hypercall and also add userspace
> support for MSR handling.
>
> Thanks,
> Ashish