Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry

From: Luis R. Rodriguez
Date: Wed Apr 13 2016 - 16:41:03 EST


On Wed, Apr 13, 2016 at 02:56:29PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 13, 2016 at 08:29:51PM +0200, Luis R. Rodriguez wrote:
> > On Mon, Apr 11, 2016 at 07:12:08AM +0200, Juergen Gross wrote:
> >
> > > What would be gained by using the same entry but having two different boot
> > > paths after it?
> >
> > It's a good question. In summary, for me it would be the push for sharing more
> > code and the push for semantics on early boot to address differences
> > proactively, and ultimately it may enable us to bring the old PV
> > boot path closer.
>
> But why? We want to kill PV (eventually).

Yeah yeah, but it's still there, and we'll have to live with it for
at least 5 years I hear. Part of my interest is to see to it
that this path gets less disruption and fewer issues, and that we also address
dead code issues which pvops simply swept under the rug. The dead code
concerns may still exist for hvmlite, so unless someone is willing
to make a bold claim that there are none, it's something to consider.

How we address semantics then is *very* important to me.

> > I'll elaborate on this but first let's clarify why a new entry is used for
> > HVMlite to start off with:
> >
> > 1) Xen ABI has historically not wanted to set up the boot params for Linux
> > guests, instead it insists on letting the Linux kernel Xen boot stubs fill
> > that out for it. This sticking point means it has implicated a boot stub.
>
>
> Which is b/c it has to be OS agnostic. It has nothing to do with 'not wanting'.

It can still be OS agnostic and pass on a type and a custom data pointer.

Would that be reasonable ?
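
To make the idea concrete, here is a rough sketch of what "type plus custom
data pointer" could look like on the guest side. This is only an illustration
under my own assumptions: every name below (hv_type, hv_boot_info, hv_stub,
hv_boot, the xen_hvmlite_* hooks) is made up for this mail and none of it
exists in the Xen ABI or in Linux today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hypervisor type values; nothing like this is in the Xen ABI. */
enum hv_type {
	HV_TYPE_NONE = 0,        /* bare metal, no hypervisor stubs */
	HV_TYPE_XEN_PV = 1,      /* no stub registered in this sketch */
	HV_TYPE_XEN_HVMLITE = 2,
};

/* The only two things the loader would be asked to set. */
struct hv_boot_info {
	uint32_t type;           /* one of enum hv_type */
	uint64_t data;           /* opaque to common code, owned by the type */
};

/* Per-type handlers run before and after the common startup code. */
struct hv_stub {
	uint32_t type;
	void (*pre)(uint64_t data);
	void (*post)(uint64_t data);
};

/* Placeholder hooks; a real one could e.g. fill in the zero page. */
static void xen_hvmlite_pre(uint64_t data)  { (void)data; }
static void xen_hvmlite_post(uint64_t data) { (void)data; }

static const struct hv_stub hv_stubs[] = {
	{ HV_TYPE_XEN_HVMLITE, xen_hvmlite_pre, xen_hvmlite_post },
};

/* Look up the handlers for a type; NULL means "take the native path". */
const struct hv_stub *hv_find_stub(uint32_t type)
{
	for (size_t i = 0; i < sizeof(hv_stubs) / sizeof(hv_stubs[0]); i++)
		if (hv_stubs[i].type == type)
			return &hv_stubs[i];
	return NULL;
}

/* Stands in for the existing startup_32() / startup_64() body. */
static void common_startup(void) { }

/* Single shared entry; returns how many hypervisor hooks ran. */
int hv_boot(const struct hv_boot_info *info)
{
	const struct hv_stub *stub;
	int hooks = 0;

	assert(info != NULL);
	stub = hv_find_stub(info->type);

	if (stub && stub->pre) {
		stub->pre(info->data);
		hooks++;
	}
	common_startup();
	if (stub && stub->post) {
		stub->post(info->data);
		hooks++;
	}
	return hooks;
}
```

The common code never interprets the data pointer, so the ABI side stays OS
agnostic: it only has to say who it is and where its own blob lives.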

> > The HVMLite boot entry tries to bring the boot entries paths closer as it
> > leverages more of the HVM boot path philosophy to mimic the regular PC boot
> > path.
> >
> > Is HVMLite supposed to support legacy PV guests as well BTW ?
>
> Gosh no.

Interesting.. and *everyone* is happy about this?

> > Reason I'm highlighting Xen ABI as a *reason* alone is that even with
> > today's large discrepancy on the old PV boot path I believe we can
> > bring the boot paths closer together if the Xen ABI was slightly
> > flexible about this. I've highlighted how I believe that is possible before,
>
> <runs away screaming>

Everyone has. If you need to support old PV guests for more than 5 years the
work I'm doing should help with that. I'm trying to leverage gains of the
work I'm doing for HVMLite, and part of this is trying to address semantics
proactively.

> > *iff* the Xen ABI would at the very least set 2 things only:
> >
> > a) Hypervisor type
> > b) A custom data pointer
> >
> > This would enable a single boot entry on the guest to handle then:
> >
> > Pseudo code:
> >
> > startup_32()                        startup_64()
> >       |                                   |
> >       |                                   |
> >       V                                   V
> > pre_hypervisor_stub_32()            pre_hypervisor_stub_64()
> >       |                                   |
> >       |                                   |
> >       V                                   V
> > [existing startup_32()]             [existing startup_64()]
> >       |                                   |
> >       |                                   |
> >       V                                   V
> > post_hypervisor_stub_32()           post_hypervisor_stub_64()
> >
> >
> > If the Xen ABI was flexible about setting a hypervisor type and custom
> > data pointer then we would have handlers for it, and in it, it can
> > do whatever it thinks is needed for its own guest types. It could
> > also continue to set the zero page on its own as it sees fit.
> >
> > Again, note that if this is done it could also mean bringing
> > the old PV boot path closer... so this is not just a prospect
> > for HVMLite but also for old PV guests.
> >
> > 2) Because of 1) it has meant that no formal semantics for early boot
> > code are available, and so severe differences are best addressed
> > by yet another boot entry. This has often meant not addressing
>
> There are semantics written for this new code: http://xenbits.xen.org/docs/unstable/misc/hvmlite.html

That only addressed semantics for early boot code implicitly through a new entry...

> All other ones related to low-level operations are described in Intel SDM.
>
>
> > or not knowing if we've addressed real differences between the different
> > entries. Case in point, dead code [0]. How do we know we will not run
> > certain code that should not run for the different entries ? Without
> > *any* semantics later in boot code to distinguish where we came from
> > and because we strive to build single kernels with different possible
> > run time environments it means we have tons of code available to
> > execute / run that we may not need.
>
> I am not following that. PVH aka HVMLite will pretty much erase the need for the
> pvops.

It does not mean there are no dead code concerns with HVMlite.

> >
> > Because of the lack of semantics we may still have dead code prospects
> > with the new HVMLite entry. How are we sure there are no differences ?
> >
> > [0] http://www.do-not-panic.com/2015/12/avoiding-dead-code-pvops-not-silver-bullet.html
> >
> > 3) Unikernel / other OS requirements: this is really tied to 2) but even if
> > we tried to evolve the Xen ABI it would mean considering existing solutions
> > out there. Things to consider as an example: FreeBSD doesn't have an EFI
> > entry, unikernels want a simple boot entry.
> >
> > With this in mind then, that I can think of:
> >
> > Cons of using the same entry but having two different boot paths:
> >
> > * Pushes the Xen ABI, needs to make everyone happy, this is hard
> > * Perhaps harder to implement
> >
> > Gains of striving to use the same entry but having two different boot paths:
> >
> > * Helps to share more code easily
> > * Reduce attack surface
> > * Requires us to have semantics for early boot; this has a series of
> > side benefits:
> > - Means you should try to address differences explicitly rather than
> > implicitly -- case in point Dead Code
> >
> > > You still need a way to distinguish between bare metal
> > > EFI and HVMlite.
> >
> > Great point! This is the semantics aspect. The new entry for HVMlite approach
> > deals with this by making the differences implicit by the new entry point.
> > My call for addressing this through a hypervisor type was to see if we can
> > get those semantics added explicitly so we can also later address dead
> > code concerns for the new HVMLite guest type.
>
> Right, they are..

There is huge merit in addressing a large chunk of the dead code concerns by
sticking closer to the native boot paths. That doesn't mean you have no
dead code concerns left with HVMlite, nor that HVMLite has no platform quirks.
It does, and part of some recent work is to pave a *clean* path for setting
these differences apart.

> > Part of my own interest in an EFI entry here is that EFI could be used to help
> > expand on the semantics in an OS/agnostic form rather than pushing the x86 boot
> > protocol further. That seems to have its own set of drawbacks though.
> >
> >
> > > And Xen needs a way to find out whether a kernel is
> > > supporting HVMlite to boot it in the correct mode.
> >
> > How was Xen going to find out if new kernels had HVMlite support with the
> > new entry ? An ELFNOTE() ? If an entry is shared could we not use an
>
> Yeah.
> > ELFNOTE() also for this though too ?
>
> Not sure what you mean by 'shared'. But you can add multiple Elf PT_NOTEs.
> See the ELF document.

OK so even if we used a common/shared entry point we can address letting
Xen find out whether or not a kernel supports HVMlite.
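
For reference, a loader-side check along those lines could be as simple as
walking the PT_NOTE payload and looking for a note from a given owner with a
given type. The sketch below assumes the usual ELF note layout (three 32-bit
header words, then name and descriptor each padded to 4 bytes) and the "Xen"
owner name used by the existing Xen notes. The note type value and the
function name find_note() are made up for illustration; this is not the real
Xen value nor how the Xen toolstack implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard ELF note header layout: three 32-bit words. */
struct note_hdr {
	uint32_t namesz;   /* length of the owner name, including the NUL */
	uint32_t descsz;   /* length of the descriptor payload */
	uint32_t type;     /* owner-defined note type */
};

/* Made-up type for "kernel has an HVMlite entry"; not a real Xen value. */
#define HYPOTHETICAL_NOTE_HVMLITE 0x1000u

/* Name and descriptor are each padded to a 4-byte boundary. */
static size_t align4(size_t x)
{
	return (x + 3) & ~(size_t)3;
}

/*
 * Scan a PT_NOTE payload for a note with the given owner and type.
 * Returns a pointer to its descriptor and stores its size in *descsz,
 * or returns NULL if no such note is present.
 */
const uint8_t *find_note(const uint8_t *buf, size_t len, const char *owner,
			 uint32_t type, uint32_t *descsz)
{
	size_t off = 0;
	size_t owner_len;

	assert(buf != NULL && owner != NULL && descsz != NULL);
	owner_len = strlen(owner) + 1;

	while (off <= len && len - off >= sizeof(struct note_hdr)) {
		struct note_hdr h;
		size_t name_off, desc_off, next;

		memcpy(&h, buf + off, sizeof(h));
		name_off = off + sizeof(h);
		desc_off = name_off + align4(h.namesz);
		next = desc_off + align4(h.descsz);
		if (next > len || next <= off)
			break;   /* truncated or malformed note, stop scanning */

		if (h.type == type && h.namesz == owner_len &&
		    memcmp(buf + name_off, owner, owner_len) == 0) {
			*descsz = h.descsz;
			return buf + desc_off;
		}
		off = next;
	}
	return NULL;
}
```

Since notes are looked up by owner and type, adding one more note for HVMlite
support does not depend on whether the entry point itself is new or shared.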

Luis