Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry

From: Luis R. Rodriguez
Date: Wed Apr 13 2016 - 18:23:34 EST


On Wed, Apr 13, 2016 at 05:08:01PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 13, 2016 at 10:40:55PM +0200, Luis R. Rodriguez wrote:
> > On Wed, Apr 13, 2016 at 02:56:29PM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Apr 13, 2016 at 08:29:51PM +0200, Luis R. Rodriguez wrote:
> > > > On Mon, Apr 11, 2016 at 07:12:08AM +0200, Juergen Gross wrote:
> > > >
> > > > > What would be gained by using the same entry but having two different boot
> > > > > paths after it?
> > > >
> > > > It's a good question. In summary, for me it would be the push for sharing
> > > > more code and the push for semantics on early boot to address differences
> > > > proactively, and ultimately it may help us bring the old PV boot path
> > > > closer.
> > >
> > > But why? We want to kill PV (eventually).
> >
> > Yeah yeah, but it's still there, and we'll have to live with it for
> > at least 5 years, I hear. Part of my interest is to see to it
> > that this path gets less disruption and fewer issues, and that we also
> > address dead code issues which pvops simply swept under the rug. The dead
> > code concerns may still exist for hvmlite, so unless someone is willing
> > to make a bold claim that there are none, it's something to consider.
>
> What is this dead code you speak of? Is it MTRR? Is it early path code
> that PV misses (like KASLR or other)?

KASAN is dead code to Xen. If you boot x86 Xen with KASAN enabled,
Xen explodes. Quick question: will KASAN not explode with HVMLite?

MTRR used to be a dead code concern, but since we have vetted most of that
code we are now pretty certain it should never run.

KASLR may be -- not sure as I haven't vetted that, but from
what I have loosely heard maybe.

VGA code will be dead code for HVMlite for sure, as the design doc
says it will not run VGA. The ACPI flag will be set, but the check
for that is not yet in Linux. That means the Linux VGA code will
be there, but we have no way to ensure it will not run, nor that
nothing will muck with it.
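
As a rough illustration of what such a check could look like (a sketch
only, not existing Linux code): I am assuming here that the flag in
question is the "VGA Not Present" bit, bit 2 of the IA-PC boot
architecture flags in the ACPI FADT. The macro and helper names below
are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: the relevant flag is "VGA Not Present", bit 2 of the
 * FADT IA-PC Boot Architecture Flags field.
 */
#define FADT_BOOT_ARCH_NO_VGA	(1u << 2)

/*
 * Hypothetical helper: returns true only if the firmware tables do not
 * declare VGA as absent, i.e. legacy VGA code may be allowed to run.
 */
static bool legacy_vga_allowed(uint16_t iapc_boot_arch)
{
	return !(iapc_boot_arch & FADT_BOOT_ARCH_NO_VGA);
}
```

Legacy VGA code would then be gated on legacy_vga_allowed() before it
touches anything, instead of assuming the hardware is there.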

To be clear -- dead code concerns still exist even without
virtualization solutions, it's just that with virtualization
this stuff comes up more and there have been no proactive
measures to address this. The question of semantics here is
to see to what extent we need earlier boot code annotations
to ensure we address semantics proactively.

> The entry point in Linux "proper" is startup_32 or startup_64 - the same
> path that EFI uses.
>
> If you were to draw this (very simplified):
>
> a)- GRUB2 ---------------------\ (creates a bootparam structure)
>                                 \
>                                  +---- startup_32 or startup_64
> b) EFI -> Linux EFI stub -------/
>        (creates bootparam)     /
> c) GRUB2-EFI -> Linux EFI-----/
>                 stub         /
> d) HVMLite -----------------/
>        (creates bootparam)

b) and d) might be able to share paths there...
d) still has its own entry; it does more than create boot params.

> (I am not sure about the c) - I would have to look in source to
> be sure). There is also LILO in this, but I am not even sure if
> it works anymore.
>
>
> What you have is that every entry point creates the bootparams
> and ends up calling startup_X. The startup_64 then hits the rest
> of the kernel. The startup_X code is the one that would set up
> the basic pagetables, segments, etc.

Sure. A full diagram should include both sides, and show how using
a custom entry runs the risk of skipping a lot of setup code.
There is that, and as others have pointed out, certain guest types
are assumed not to have certain peripherals, and we have no way
to ensure that certain old legacy code will never run or be accessed
by drivers.

> > How we address semantics then is *very* important to me.
>
> Which semantics? How the CPU is going to be at startup_X ? Or
> how the CPU is going to be when EFI firmware invokes the EFI stub?
> Or when GRUB2 loads Linux?

What hypervisor kicked me and what guest type I am.

Let me elaborate more below.

> That (those bootloaders) is clearly defined. The URL I provided
> mentions the HVMLite one. The Documentation/x86/boot.txt mentions
> what semantics are to be expected when providing a bootstrap
> (which is what HVMLite stub code in Linux would write against -
> and what EFI stub code had been written against too).
> >
> > > > I'll elaborate on this but first let's clarify why a new entry is used for
> > > > HVMlite to start off with:
> > > >
> > > > 1) Xen ABI has historically not wanted to set up the boot params for Linux
> > > > guests; instead it insists on letting the Linux kernel Xen boot stubs fill
> > > > that out for it. This sticking point has necessitated a boot stub.
> > >
> > >
> > > Which is b/c it has to be OS agnostic. It has nothing to do with 'not wanting'.
> >
> > It can still be OS agnostic and pass on a type and a custom data pointer.
>
> Sure. It has that (it MUST, otherwise how else would you pass data?).
> It is documented as well: http://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,xen.h.html#incontents_startofday
> (see "Start of day structure passed to PVH guests in %ebx.")

The design doc begs for a custom OS entry point though.
If we had a single 'type' and 'custom data' passed to the kernel, that
should suffice for the default Linux entry point to just pivot off
of that and do what it needs without more entry points. Once.
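
To make the 'type' plus 'custom data' idea concrete, here is a minimal
sketch of the pivot. Everything in it is hypothetical: the enum values,
the struct and the function names are invented for illustration and are
not part of any existing Xen or Linux ABI. The return values only
identify which early setup ran, so the sketch is observable.

```c
#include <stddef.h>

/* Hypothetical guest types; the names are illustrative only. */
enum boot_guest_type {
	BOOT_GUEST_BARE_METAL = 0,
	BOOT_GUEST_XEN_PV,
	BOOT_GUEST_XEN_HVMLITE,
};

/* Hypothetical: the only two things handed to the default entry. */
struct boot_hint {
	enum boot_guest_type type;
	void *data;	/* opaque, type-specific, e.g. a start of day structure */
};

/* Hypothetical per-type early setup hooks (stubs for the sketch). */
static int setup_bare_metal(void *data)  { (void)data; return 0; }
static int setup_xen_pv(void *data)      { (void)data; return 1; }
static int setup_xen_hvmlite(void *data) { (void)data; return 2; }

/*
 * Single default entry: pivot once on the type, then everything
 * continues on the common boot path.
 */
static int default_entry(const struct boot_hint *hint)
{
	switch (hint->type) {
	case BOOT_GUEST_XEN_PV:
		return setup_xen_pv(hint->data);
	case BOOT_GUEST_XEN_HVMLITE:
		return setup_xen_hvmlite(hint->data);
	case BOOT_GUEST_BARE_METAL:
	default:
		return setup_bare_metal(hint->data);
	}
}
```

The point of the shape is that the guest-specific differences live
behind one switch, rather than behind a separate entry point per guest
type.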

Luis