Re: [RFC PATCH] KVM: x86: Allow Qemu/KVM to use PVH entry point

From: Roger Pau Monné
Date: Wed Nov 29 2017 - 03:50:59 EST


On Wed, Nov 29, 2017 at 09:21:59AM +0100, Juergen Gross wrote:
> On 28/11/17 20:34, Maran Wilson wrote:
> > For certain applications it is desirable to rapidly boot a KVM virtual
> > machine. In cases where legacy hardware and software support within the
> > guest is not needed, Qemu should be able to boot directly into the
> > uncompressed Linux kernel binary without the need to run firmware.
> >
> > There already exists an ABI to allow this for Xen PVH guests and the ABI is
> > supported by Linux and FreeBSD:
> >
> > https://xenbits.xen.org/docs/unstable/misc/hvmlite.html

I would also add a link to:

http://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,arch-x86,hvm,start_info.h.html#Struct_hvm_start_info

> > This PoC patch enables Qemu to use that same entry point for booting KVM
> > guests.
> >
> > Even though the code is still PoC quality, I'm sending this as an RFC now
> > since there are a number of different ways the specific implementation
> > details can be handled. I chose a shared code path for Xen and KVM guests
> > but could just as easily create a separate code path that is advertised by
> > a different ELF note for KVM. There also seems to be some flexibility in
> > how the e820 table data is passed and how (or if) it should be identified
> > as e820 data. As a starting point, I've chosen the options that seem to
> > result in the smallest patch with minimal to no changes required of the
> > x86/HVM direct boot ABI.
>
> I like the idea.
>
> I'd rather split up the different hypervisor types early and use a
> common set of service functions instead of special casing xen_guest
> everywhere. This would make it much easier to support the KVM PVH
> boot without the need to configure the kernel with CONFIG_XEN.
>
> Another option would be to use the same boot path as with grub: set
> the boot params in zeropage and start at startup_32.

I think I prefer this approach, since AFAICT it should allow for
more code sharing with the common boot path.

>
> Juergen
>
> > ---
> > arch/x86/xen/enlighten_pvh.c | 74 ++++++++++++++++++++++++++++++++------------
> > 1 file changed, 55 insertions(+), 19 deletions(-)
> >
> > diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
> > index 98ab176..d93f711 100644
> > --- a/arch/x86/xen/enlighten_pvh.c
> > +++ b/arch/x86/xen/enlighten_pvh.c
> > @@ -31,21 +31,46 @@ static void xen_pvh_arch_setup(void)
> >  	acpi_irq_model = ACPI_IRQ_MODEL_PLATFORM;
> >  }
> >
> > -static void __init init_pvh_bootparams(void)
> > +static void __init init_pvh_bootparams(bool xen_guest)
> >  {
> >  	struct xen_memory_map memmap;
> >  	int rc;
> >
> >  	memset(&pvh_bootparams, 0, sizeof(pvh_bootparams));
> >
> > -	memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
> > -	set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
> > -	rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
> > -	if (rc) {
> > -		xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
> > -		BUG();
> > +	if (xen_guest) {
> > +		memmap.nr_entries = ARRAY_SIZE(pvh_bootparams.e820_table);
> > +		set_xen_guest_handle(memmap.buffer, pvh_bootparams.e820_table);
> > +		rc = HYPERVISOR_memory_op(XENMEM_memory_map, &memmap);
> > +		if (rc) {
> > +			xen_raw_printk("XENMEM_memory_map failed (%d)\n", rc);
> > +			BUG();
> > +		}
> > +		pvh_bootparams.e820_entries = memmap.nr_entries;
> > +	} else if (pvh_start_info.nr_modules > 1) {
> > +		/* The second module should be the e820 data for KVM guests */

I don't think this is desirable. You might want to boot other OSes
using this method, and they might want to pass more than one module.

IMHO the hvm_start_info structure should be extended to contain a
pointer to the memory map. Note that there's a 'version' field that
can be bumped to signal the new layout. Even on Xen we might want to
pass the memory map this way instead of using the hypercall.
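
To make the suggestion concrete, here is a rough, untested sketch of
what I have in mind. The first seven fields mirror the existing
hvm_start_info layout; the memmap_* fields, the sketch_* names and the
"version >= 1" rule are only assumptions for illustration, not an
agreed ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HVM_START_MAGIC_VALUE 0x336ec578u

/* Hypothetical memory map entry, loosely modelled on an e820 entry. */
struct sketch_memmap_entry {
	uint64_t addr;     /* base physical address of the region */
	uint64_t size;     /* length of the region in bytes */
	uint32_t type;     /* e820-style region type */
	uint32_t reserved; /* padding, must be zero */
};

/* Existing fields first, followed by the assumed extension. */
struct sketch_start_info {
	uint32_t magic;
	uint32_t version;
	uint32_t flags;
	uint32_t nr_modules;
	uint64_t modlist_paddr;
	uint64_t cmdline_paddr;
	uint64_t rsdp_paddr;
	/* Assumed new fields, only meaningful when version >= 1. */
	uint64_t memmap_paddr;   /* physical address of the entry array */
	uint32_t memmap_entries; /* number of entries in that array */
	uint32_t reserved;       /* padding, must be zero */
};

/*
 * Return the number of memory map entries supplied through the
 * structure, or 0 if the caller has to obtain the map some other way
 * (on Xen, for example, through the XENMEM_memory_map hypercall).
 */
static uint32_t sketch_memmap_count(const struct sketch_start_info *si)
{
	assert(si != NULL);

	if (si->magic != HVM_START_MAGIC_VALUE)
		return 0;
	if (si->version < 1 || si->memmap_paddr == 0)
		return 0;
	return si->memmap_entries;
}
```

With something like this the module list stays free for other OSes,
and a version 0 producer simply never sets the new fields.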

Thanks, Roger.