Re: [PATCH] KVM: hyper-v: Add new exit reason HYPERV_OVERLAY
From: Siddharth Chandrasekaran
Date: Mon Apr 26 2021 - 17:42:07 EST
On Fri, Apr 23, 2021 at 12:18:31PM +0200, Alexander Graf wrote:
> On 23.04.21 12:15, Paolo Bonzini wrote:
> > On 23/04/21 11:58, Alexander Graf wrote:
> > > > In theory userspace doesn't know how KVM wishes to implement the
> > > > hypercall page, especially if Xen hypercalls are enabled as well.
> > >
> > > I'm not sure I agree with that sentiment :). User space is the one that
> > > sets the xen compat mode. All we need to do is declare the ORing as part
> > > of the KVM ABI. Which we effectively are doing already, because it's
> > > part of the ABI to the guest, no?
> >
> > Good point. But it may change in the future based on KVM_ENABLE_CAP or
> > whatever, and duplicating code between userspace and kernel is ugly. We
> > already have too many unwritten conventions around CPUID, MSRs, get/set
> > state ioctls, etc.
>
> Yes, I agree. So we can just declare that there won't be any changes to the
> hcall page in-kernel handling code going forward, no? :)
>
> If you want to support a new CAP, support an actual overlay page first - and
> thus actually respect the TLFS.
>
> > That said, this definitely tilts the balance against adding an ioctl to
> > write the hypercall page contents. Userspace can either use the
> > KVM_SET_MSR or assemble it on its own, and one of the two should be okay.
>
> Sounds great. And in the future if we need to move the Xen offset, we should
> rather make the Xen offsetting a parameter from user space.

Okay, assembling the hypercall page contents in user space is possible,
but it doesn't help us much:

1. It is best to keep the instruction patching in one place; the
   kernel is already doing it (and we cannot remove that).
2. It is not possible to assemble all overlay pages in user space. For
   instance, we cannot assemble the VP assist page. The hypercall code
   page is really a special case.
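
For reference, a minimal sketch of what assembling the hypercall code page in user space would involve, and hence what would be duplicated from the kernel's patching code. The helper name and its parameters are hypothetical (not a KVM or VMM API). The byte sequences are the standard x86 encodings of the optional Xen-compat "orl $0x80000000, %eax", then vmcall or vmmcall, then ret, which is my understanding of what the in-kernel code emits today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space helper: write the hypercall stub into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 *   is_amd     - emit vmmcall instead of vmcall
 *   xen_compat - prepend the ORing used when Xen hypercalls are enabled
 */
static size_t build_hypercall_stub(uint8_t *buf, size_t len,
                                   int is_amd, int xen_compat)
{
    static const uint8_t or_eax[]  = { 0x0d, 0x00, 0x00, 0x00, 0x80 }; /* orl $0x80000000, %eax */
    static const uint8_t vmcall[]  = { 0x0f, 0x01, 0xc1 };
    static const uint8_t vmmcall[] = { 0x0f, 0x01, 0xd9 };
    size_t need = (xen_compat ? sizeof(or_eax) : 0) + sizeof(vmcall) + 1;
    size_t i = 0;

    assert(buf != NULL);
    if (len < need)
        return 0;

    if (xen_compat) {
        memcpy(buf + i, or_eax, sizeof(or_eax));
        i += sizeof(or_eax);
    }
    memcpy(buf + i, is_amd ? vmmcall : vmcall, sizeof(vmcall));
    i += sizeof(vmcall);
    buf[i++] = 0xc3; /* ret */

    return i;
}
```

Any future change to the in-kernel sequence would have to be mirrored here, which is the duplication concern in point 1.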

So I'd side with the KVM_SET_MSR approach and have a convention that all
overlay page requests are trapped to user space first, where the
page gets overlaid, and then user space forwards the MSR write to the kernel
so it can do a kvm_vcpu_write_guest() if needed. IMO, this allows the most
flexibility.

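
A rough sketch of that convention from the user-space side, as a self-contained model rather than real VMM code. The struct, the handler, and forward_msr_to_kernel() are hypothetical stand-ins; in a real VMM the forward step would be an MSR-setting ioctl on the vCPU fd, and the overlay step would remap guest memory. The MSR index and bit layout (bit 0 enable, GPFN from bit 12 up) follow the Hyper-V TLFS definition of HV_X64_MSR_HYPERCALL.

```c
#include <assert.h>
#include <stdint.h>

#define HV_X64_MSR_HYPERCALL         0x40000001u
#define HV_X64_MSR_HYPERCALL_ENABLE  0x1ull
#define HV_PAGE_SHIFT                12
#define HV_PAGE_MASK                 (~((1ull << HV_PAGE_SHIFT) - 1))

/* Hypothetical VMM bookkeeping for one overlay page. */
struct vmm_state {
    int      overlay_mapped;  /* is the overlay currently in place? */
    uint64_t overlay_gpa;     /* guest-physical address of the overlay */
    int      forwards;        /* number of MSR writes passed to the kernel */
    uint64_t forwarded_val;   /* last value passed to the kernel */
};

/* Stand-in for handing the MSR write back to the kernel. */
static void forward_msr_to_kernel(struct vmm_state *s, uint32_t msr, uint64_t data)
{
    (void)msr;
    s->forwards++;
    s->forwarded_val = data;
}

/*
 * Called when an overlay-page MSR write has been trapped to user space.
 * Step 1: user space places (or removes) the overlay.
 * Step 2: the unchanged MSR write is forwarded so the kernel can fill
 *         in the page contents if it needs to.
 * Returns 0 if handled, -1 if msr is not an overlay MSR known here.
 */
static int handle_overlay_msr_write(struct vmm_state *s, uint32_t msr, uint64_t data)
{
    if (msr != HV_X64_MSR_HYPERCALL)
        return -1;

    if (data & HV_X64_MSR_HYPERCALL_ENABLE) {
        s->overlay_gpa = data & HV_PAGE_MASK;
        s->overlay_mapped = 1;
    } else {
        s->overlay_mapped = 0;
    }

    forward_msr_to_kernel(s, msr, data);
    return 0;
}
```

The same handler shape would extend to other overlay MSRs (e.g. the VP assist page), with only the kernel-side fill step differing per page.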
~ Sid.