Re: [PATCH v2 02/11] xen/hvmlite: Bootstrap HVMlite guest
From: Luis R. Rodriguez
Date: Thu Feb 04 2016 - 15:57:29 EST
On Thu, Feb 04, 2016 at 02:54:15PM -0500, Boris Ostrovsky wrote:
> On 02/03/2016 06:40 PM, Luis R. Rodriguez wrote:
> >On Wed, Feb 03, 2016 at 03:11:56PM -0500, Boris Ostrovsky wrote:
> >>On 02/03/2016 01:55 PM, Luis R. Rodriguez wrote:
> >>>I saw no considerations for the recommendations I had made last on your v1:
> >>>
> >>>https://lkml.kernel.org/r/CAB=NE6XPA0YzbnM8=rspkKai6d3GkXXO00Gr0VZUYoyzNy6thw@xxxxxxxxxxxxxx
> >>>
> >>>Of importance:
> >>>
> >>>1) Using pv_info.paravirt_enabled = 1 is wrong unless you mean to say this
> >>> is for legacy x86:
> >>>
> >>>Your patch #3 keeps on setting pv_info.paravirt_enabled = 1 and as discussed
> >>>this is wrong. It will be renamed to x86_legacy_free() to align with what folks
> >>>are pushing for a BIOS flag to annotate if a system requires legacy x86 stuff.
> >>>This also means re-thinking all use cases and ensuring subarch is used then
> >>>instead when the goal was to avoid Xen from entering that code. Today Xen does
> >>>not use this but with my work it does and it helps clean and brush up a lot of
> >>>these checks with future prospects to even help unify entry points.
> >>As I said earlier, I am not sure I understand what subarch buys us
> >>for HVMlite guests.
> >I accepted subarch may not be the right thing, so proposed a hypervisor type.
>
> I don't see much difference between having an HV-specific subarch
> and a hypervisor type.
Ah, well here lies the issue. As per hpa, subarch was not designed for defining
a hypervisor, but rather at least subarch PC (0) [should be used if the
hardware is] "enumerable using standard PC mechanisms (PCI, ACPI) and doesn't
need a special boot flow". Does that follow the definition of HVMlite?
I was pointing out to hpa how paravirt_enabled() has limitations in that it is
set late and as such can only logically be available to all users after
setup_arch(), so I figured we could repurpose subarch for a hypervisor type.
He noted:
"If you have a genuine need for a "hypervisor type" then that is a
separate thing and should be treated separately from subarch. However,
you need to consider that some hypervisors can emulate other hypervisors
and you may have more than one hypervisor API available."
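To make hpa's point concrete, below is a rough, untested userspace sketch of
a hypervisor type that allows more than one hypervisor API to be advertised
at once, by using a bitmask rather than a single value. Every name in it
(hv_api, hv_boot_info, hv_api_available(), hv_present()) is made up for
illustration; nothing like this exists in the x86 boot protocol today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one bit per hypervisor API, so several can be set at once. */
enum hv_api {
	HV_API_NONE   = 0,
	HV_API_XEN    = 1u << 0,
	HV_API_KVM    = 1u << 1,
	HV_API_HYPERV = 1u << 2,
};

#define HV_API_ALL (HV_API_XEN | HV_API_KVM | HV_API_HYPERV)

/* Hypothetical boot-time record, modelled loosely on subarch + subarch_data. */
struct hv_boot_info {
	uint32_t apis;	/* bitmask of enum hv_api values */
	uint64_t data;	/* opaque pointer to hypervisor-specific data */
};

/* True if the given API was advertised by whoever filled in the record. */
static inline bool hv_api_available(const struct hv_boot_info *info,
				    enum hv_api api)
{
	assert((api & ~HV_API_ALL) == 0);	/* reject unknown bits */
	return (info->apis & api) != 0;
}

/* True if any hypervisor API at all was advertised. */
static inline bool hv_present(const struct hv_boot_info *info)
{
	return info->apis != 0;
}
```

A bitmask is only one way to handle a hypervisor that emulates another; an
ordered list of types would also work, and the sketch does not take a
position on who fills in the record.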
> >What it buys you is a strong semantics association between code designed
> >for a purpose.
> >
> >>As for using paravirt_enabled -- this is really only used to
> >>differentiate HVM from HVMlite and I think (although I'd need to
> >>check) is only needed by Xen-specific code in a couple of places.
> >That sounds like a Xen-specific use case; an interface that is about to be
> >renamed to reflect its actual use case should not be abused for that
> >purpose.
> >
> >>So if/when it is removed we will switch to something else. Since your work is
> >>WIP I decided to keep using it until it's clear what other options may be
> >>available.
> >And your work is not WIP? I'll be splitting my patches up and the rename
> >will be atomic, so it can likely go in before yours; not sure why you
> >are simply brushing this off.
>
> I didn't mean to imply anything by saying that your patches are a
> WIP. It's just that I can only write and test my patches against
> existing code, not the future one.
>
> I am sorry if you felt I was trying to say something else, it
> certainly was not my intent.
I don't really care about that; my point was that we are both working on
similar areas right now and both efforts are helping us clean up the init
path and give us better semantics, so we should take both patch series into
consideration as they *both* are being reviewed now. The definition and use
of subarch is at least of importance here for HVMLite in consideration for
future cleanup.
> >>>2) We should avoid more hypervisor type hacks, and just consider a new
> >>> hypervisor type to close the gap:
> >>>
> >>>Using x86_legacy_free() and friends in a unified way for all systems means it
> >>>should only be used after init_hypervisor_platform() which is called during
> >>>setup_arch(). This means we have a semantic gap for checks on "are we on
> >>>a hypervisor, and which type?".
> >>In this particular case we don't need any information about
> >>hypervisor until init_hypervisor_platform().
> >I pointed out in your v1 patchset how microcode loading was not blocked, you
> >then asked how KVM does it, and that was explained as well, and that they
> >don't enable it either. You need a solution for this.
>
> Not really. Xen will ignore writes to microcode-specific MSRs, just
> like KVM.
>
> This is exact same behavior we have now with regular HVM guests.
OK great. That still means the code will run, and if we can avoid that,
why not? I am fine with annotating this as future work. Let me then ask
as well: how about the rest of the code during and after
startup_32() and startup_64() -- are we sure that's all safe?
> >As-is the x86 boot protocol would not allow an easy way for this,
> >I'm suggesting we consider extending the boot protocol to add a
> >hypervisor type and data pointer much as with subarch and
> >subarch_data for the
>
> Who will set hypervisor type and where? It won't be Xen as Andrew
> mentioned in another email.
Andrew seems to think I'm after some senseless prodding, but there are
good reasons to consider setting at least a type and custom data
pointer, and in fact I think there are gains from this not only for
Linux but for other OSes; so I'll keep working on my arguments there.
> >particular purpose of both enabling entry into the same startup_32()
> >but also a clean way for modifications of stubs both at the beginning
> >and at the end of startup_32().
> >
> >Pseudo code:
> >
> >startup_32() startup_64()
> > | |
> > | |
> > V V
> >pre_hypervisor_stub_32() pre_hypervisor_stub_64()
> > | |
> > | |
> > V V
> > [existing startup_32()] [existing startup_64()]
> > | |
> > | |
> > V V
> >post_hypervisor_stub_32() post_hypervisor_stub_64()
> >
> >The pre_hypervisor_stub_32() would have much of the code in
> >hvmlite_start_xen() but for 32-bit, pre_hypervisor_stub_64()
> >would have the 64-bits.
>
>
> Sure. When the protocol is agreed upon and this code is written we
> will just move hvmlite_start_xen() to pre_hypervisor_stub_32().
OK fair enough.
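For reference, the pre/post stub flow from the pseudo code above can be
modelled in plain C roughly as follows. This is an untested userspace
sketch, not kernel code: the real startup_32()/startup_64() are assembly
entry points, and hv_type, hv_stubs, boot_log and the step names are all
hypothetical and exist only to show the ordering.

```c
#include <assert.h>

/* Hypothetical hypervisor type, as it might be passed by a boot protocol. */
enum hv_type { HV_TYPE_NONE, HV_TYPE_XEN_HVMLITE, HV_TYPE_MAX };

/* Steps recorded so the call order can be inspected afterwards. */
enum boot_step { STEP_PRE_STUB, STEP_EXISTING, STEP_POST_STUB };

struct boot_log {
	enum boot_step step[3];
	int n;
};

static void log_step(struct boot_log *log, enum boot_step s)
{
	assert(log->n < 3);
	log->step[log->n++] = s;
}

/* Per-hypervisor hooks; either pointer may be left NULL. */
struct hv_stubs {
	void (*pre)(struct boot_log *log);
	void (*post)(struct boot_log *log);
};

/* Stand-in for what hvmlite_start_xen() would do before common startup. */
static void hvmlite_pre(struct boot_log *log)  { log_step(log, STEP_PRE_STUB); }
static void hvmlite_post(struct boot_log *log) { log_step(log, STEP_POST_STUB); }

static const struct hv_stubs stubs[HV_TYPE_MAX] = {
	[HV_TYPE_XEN_HVMLITE] = { .pre = hvmlite_pre, .post = hvmlite_post },
};

/* Stand-in for the unmodified body of startup_32()/startup_64(). */
static void existing_startup(struct boot_log *log)
{
	log_step(log, STEP_EXISTING);
}

/* Shared entry: pre stub, existing startup, then post stub. */
static void startup(enum hv_type type, struct boot_log *log)
{
	assert(type < HV_TYPE_MAX);
	const struct hv_stubs *s = &stubs[type];

	if (s->pre)
		s->pre(log);
	existing_startup(log);
	if (s->post)
		s->post(log);
}
```

With HV_TYPE_NONE both hooks are NULL, so bare metal takes the existing
path unchanged.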
> >+int xen_hvmlite __attribute__((section(".data"))) = 0;
> >+struct hvm_start_info hvmlite_start_info __attribute__((section(".data")));
> >+uint hvmlite_start_info_sz = sizeof(hvmlite_start_info);
> >+struct boot_params xen_hvmlite_boot_params __attribute__((section(".data")));
> >+#endif
> >+
> >>>The section annotations seem like a very special use case but likely worth documenting
> >>>and defining a new macro for in include/linux/compiler.h. This would make it
> >>>easier to change should we want to change the section used here later and
> >>>enable others to easily look for the reason for these annotations in a
> >>>single place.
> >>I wonder whether __initdata would be a good attribute. We only need
> >>this early in the boot.
> >I could not find other users of .data other than some specific driver.
> >Using anything with *init* implies you can free the data later, but if we
> >want to keep it I suggest a different prefix, up to you.
>
> That's why I said that we only need this info early in the boot.
Still -- better to just document it and add a shared macro for it.
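Something along these lines is what I mean, as an untested userspace
sketch. The macro name __hvmlite_early_data and all the demo_* names are
made up, and the struct is a placeholder rather than the real
struct hvm_start_info; the only thing taken from the patch is the
section(".data") attribute itself.

```c
#include <assert.h>

/*
 * Hypothetical shared macro: one place to document why these variables
 * are forced into .data, and one place to change the section later.
 */
#define __hvmlite_early_data __attribute__((section(".data")))

/* Placeholder layout for illustration only. */
struct demo_start_info {
	unsigned int magic;
	unsigned int flags;
};

int demo_hvmlite __hvmlite_early_data = 0;
struct demo_start_info demo_start_info __hvmlite_early_data;
unsigned int demo_start_info_sz = sizeof(demo_start_info);

/* Copy the start info handed over at entry and mark the guest type. */
static inline void demo_record_start_info(const struct demo_start_info *src)
{
	assert(src);
	demo_start_info = *src;
	demo_hvmlite = 1;
}
```

Anyone grepping for the macro then lands on the comment explaining the
annotation instead of on four open-coded attributes.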
Luis