Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry

From: Konrad Rzeszutek Wilk
Date: Thu Apr 14 2016 - 22:16:30 EST


On Thu, Apr 14, 2016 at 11:12:01PM +0200, Luis R. Rodriguez wrote:
> On Thu, Apr 14, 2016 at 04:38:47PM -0400, Konrad Rzeszutek Wilk wrote:
> > > This has nothing to do with dominance or anything nefarious, I'm asking
> > > simply for a full engineering evaluation of all possibilities, with
> > > the long term in mind. Not for now, but for hardware assumptions which
> > > are sensible 5 years from now.
> >
> > There are two different things in my mind about this conversation:
> >
> > 1). semantics of low-level code wrapped around pvops. On baremetal
> > it is easy - just look at Intel and AMD SDM.
> > And this is exactly what running in HVM or HVMLite mode will do -
> > all those low-level operations will have the same exact semantic
> > as baremetal.
>
> Today Linux is KVM stupid for early boot code. I've pointed this out

-EPARSE?
> before, but again, there has been no reason found to need this. Perhaps
> for HVMLite we won't need this...

Are you talking about kvmtool? Which BTW is similar to how HVMLite
would expose the platform.
>
> > There is no hope for the pv_ops to fix that.
>
> Actually I beg to differ. See my patches and ongoing work.

I meant in terms of semantics. As in, I cannot see some of
those pv-ops ever having the same semantics as baremetal. For example
set_pte is simple on x86 (movq $<some value>, <memory address>).

While on Xen PV it is a potentially batched hypercall with a
lookup in the P2M table, then perhaps a sidelong look at
the M2P, then maybe the M2P override.
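
To make the contrast concrete, here is a rough user-space sketch in C.
It is not kernel or Xen code: every toy_* name, the tiny P2M array, the
batch size and the flush counter are made up for illustration, and the
"hypercall" is just a loop that applies the queued writes. It leaves out
the M2P and the M2P override entirely. The only point is that the native
path is a single store, while the PV-style path translates the frame
number and may defer the write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;

#define PAGE_SHIFT     12
#define PTE_FLAGS_MASK ((1ULL << PAGE_SHIFT) - 1)

/* Baremetal-like semantics: setting a PTE is one memory store. */
static void native_set_pte(pte_t *ptep, pte_t val)
{
    *ptep = val;
}

/* Hypothetical toy P2M: guest pseudo-physical frame -> machine frame. */
#define TOY_P2M_SIZE 8
static const uint64_t toy_p2m[TOY_P2M_SIZE] = {
    100, 101, 102, 103, 200, 201, 202, 203
};

/* Hypothetical queue of pending updates, standing in for a batch. */
struct toy_update {
    pte_t *ptep;
    pte_t val;
};

#define TOY_BATCH_MAX 4
static struct toy_update toy_batch[TOY_BATCH_MAX];
static size_t toy_batch_len;
static unsigned int toy_flush_count;

/* Stand-in for the hypercall: apply every queued update at once. */
static void toy_flush(void)
{
    if (toy_batch_len == 0)
        return;
    for (size_t i = 0; i < toy_batch_len; i++)
        *toy_batch[i].ptep = toy_batch[i].val;
    toy_batch_len = 0;
    toy_flush_count++;
}

/*
 * PV-like semantics: translate the frame through the toy P2M, then
 * queue the write. Returns 0 on success, -1 if the frame is unknown.
 */
static int toy_pv_set_pte(pte_t *ptep, pte_t val)
{
    uint64_t pfn = val >> PAGE_SHIFT;

    if (pfn >= TOY_P2M_SIZE)
        return -1;

    if (toy_batch_len == TOY_BATCH_MAX)
        toy_flush();
    assert(toy_batch_len < TOY_BATCH_MAX);

    toy_batch[toy_batch_len].ptep = ptep;
    toy_batch[toy_batch_len].val =
        (toy_p2m[pfn] << PAGE_SHIFT) | (val & PTE_FLAGS_MASK);
    toy_batch_len++;
    return 0;
}
```

With this toy, a caller of native_set_pte() sees the new value right
away, while a caller of toy_pv_set_pte() sees nothing change until
toy_flush() runs, and the value that lands carries the machine frame
rather than the one that was passed in. That is the kind of semantic
gap I mean.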

>
> > And I am pretty sure the HVMLite in 5 years will have no
> > trouble in this as it will be running in VMX mode (HVM).
>
> HVMLite may still use PV drivers for some things, it's not super
> obvious to me that low level semantics will not be needed yet.

PV drivers are very different from low-level semantics.

And it will have to use them.

Maybe it is easier to think of this in terms of kvmtool - it
is pretty much how this would work - but instead of VirtIO
drivers you would be using the Xen PV drivers (though one
could also use VirtIO ones if you wanted).
>
> > 2). Boot entry.
> >
> > The semantics on Linux are well known - they are documented in
> > Documentation/x86/boot.txt.
> >
> > HVMLite Linux guests have to somehow provide that.
> >
> > And how it is done seems to be tied around:
> >
> > a) Use existing boot paths - which means making some
> > extra stub code to call in those existing boot paths
> > (for example Xen could bundle with a GRUB2-alike
> > code to be run when booting Linux using that boot-path).
> >
> > Or EFI (for a ton more code). Granted not all OSes
> > support those, so not very OS agnostic.
>
> What other OSes do is something to consider but if they don't
> do it because they are slacking in one domain should by no means
> be a reason to not evaluate the long term possible gains.
> Especially if we have reasons to believe more architectures will
> consider it and standardize on it.
>
> It'd be silly not to take this a bit more seriously.

Complexity vs simplicity.
>
> > Hard part - if the bootparams change then have to
> > rev up the code in there. May be out of sync
> > with Linux bootparams.
>
> If we are going to ultimately standardize on EFI boot for new
> hardware it'd be rather silly to extend the boot params further.

Whoa there... Have you spoken to hpa, tglx about this?

>
> > b) Add another simpler boot entry point which has to copy
> > "some" strings from its format in bootparams.
> >
> >
> > So this part of the discussion does not fall in the
> > hardware assumptions. Intel SDM or AMD mention nothing about
> > boot loaders or how to boot an OS - that is all in realms
> > of how software talks to software.
>
> Right -- so one question to ask here is what other uses are there
> for this outside of say HVMLite. You mentioned Multiboot so far.
>
> > 3). And there is the discussion on man-power to make this
> > happen.
>
> Sure.
>
> > 4). Lastly which one is simpler and involves less code so
> > that there is less chance of bitrot.
>
> Indeed.
>
> You also forgot the tie-in between dead-code and semantics but

Wait, I just spoke about CPU semantics?! Which semantics
are you talking about?
> that clearly is not on your mind. But I'd say this is a good
> summary.

I put 'dead code' in the same realm as device drivers work.
And they seem to always have some issue or another.
Or maybe I am just getting unlucky and getting copied on those bugs.
>
> Luis