Re: [PATCH v3 01/11] x86/boot: enumerate documentation for the x86 hardware_subarch
From: Luis R. Rodriguez
Date: Tue Feb 23 2016 - 05:34:49 EST
On Tue, Feb 23, 2016 at 09:51:19AM +0100, Ingo Molnar wrote:
>
> * Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
>
> > Although hardware_subarch has been in place since the x86 boot
> > protocol 2.07 it hasn't been used much. Enumerate current possible
> > values to avoid misuses and help with semantics later at boot
> > time should this be used further.
> >
> > v2: fix typos
> >
> > Cc: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
> > Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxxxx>
> > ---
> > arch/x86/include/uapi/asm/bootparam.h | 31 ++++++++++++++++++++++++++++++-
> > 1 file changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> > index 329254373479..50d5009cf276 100644
> > --- a/arch/x86/include/uapi/asm/bootparam.h
> > +++ b/arch/x86/include/uapi/asm/bootparam.h
> > @@ -157,7 +157,36 @@ struct boot_params {
> > __u8 _pad9[276]; /* 0xeec */
> > } __attribute__((packed));
> >
> > -enum {
> > +/**
> > + * enum x86_hardware_subarch - x86 hardware subarchitecture
> > + *
> > + * The x86 hardware_subarch and hardware_subarch_data were added as of the x86
> > + * boot protocol 2.07 to help distinguish and supports custom x86 boot
> > + * sequences. This enum represents accepted values for the x86
> > + * hardware_subarch. Custom x86 boot sequences (not X86_SUBARCH_PC) do not have
> > + * or simply do not make use of natural stubs like BIOS or EFI, the
> > + * hardware_subarch can be used on the Linux entry path to revector to a
> > + * subarchitecture stub when needed. This subarchitecture stub can be used to
> > + * set up Linux boot parameters or for special care to account for nonstandard
> > + * handling of page tables.
> > + *
> > + * KVM and Xen HVM do not have a subarch as these are expected to follow
> > + * standard x86 boot entries. If there is a genuine need for "hypervisor" type
> > + * that should be considered separately in the future.
> > + *
> > + * @X86_SUBARCH_PC: Should be used if the hardware is enumerable using standard
> > + * PC mechanisms (PCI, ACPI) and doesn't need a special boot flow.
> > + * @X86_SUBARCH_LGUEST: Used for x86 hypervisor demo, lguest
> > + * @X86_SUBARCH_XEN: Used for Xen guest types which follow the PV boot path,
> > + * which start at asm startup_xen() entry point and later jump to the C
> > + * xen_start_kernel() entry point.
> > + * @X86_SUBARCH_INTEL_MID: Used for Intel MID (Mobile Internet Device) platform
> > + * systems which do not have the PCI legacy interfaces.
> > + * @X86_SUBARCH_CE4100: Used for Intel CE media processor (CE4100) SOC for
> > + * for settop boxes and media devices, the use of a subarch for CE4100
> > + * is more of a hack...
> > + */
> > +enum x86_hardware_subarch {
> > X86_SUBARCH_PC = 0,
> > X86_SUBARCH_LGUEST,
> > X86_SUBARCH_XEN,
>
> No, this is really backwards.
>
> While I agree that we want to get rid of paravirt_enabled(), we _dont_ want to
> spread the use of (arguably broken) boot flags like this!
I agree that we should not see the spread of boot flags around general x86
code, its not my goal to spread it though, the code that uses it here though is
*early boot code* (although in retrospect the pnpbios use was a fuckup), and I
have some special considerations for early boot code which I think does give
merit to it use. But also keep in mind my goal is to rather fold the boot flag
so its more just an architectural consideration eventually. More on this below.
> This is one of the cases
> where the cure is worse than the disease.
>
> The 'modern' way to handle platform quirks is to have hardware drivers with
> auto-detection, and drivers know how to detect whether the hardware is present or
> not. For legacy cases where no auto-detection is possible, we have per hardware
> capability flags to turn it off.
>
> The 'hardware subarch' flag of the bootloader can be used to install certain
> quirks, in a single early bootup function - but that should be all: ideally no
> quirks are needed. We don't want to spread 'subarch flags' into various unrelated
> code.
There's another possible use for the subarch which I intend on using later for
early boot code to proactively help with disconnect on modifications of the
different x86 entry points. A summary of this is, at times x86 may get some
enhancement / feature which requires some small modifications on early boot
code and if only the bare metal entry point gets vetted on development then
chances are high the xen entry path will be faulty. At times some "feature"
or enhancement may be rather small, and all we need is a respective call
placed on the Xen entry path. Such was the case when the cr4 shadow was
added, since Xen didn't get it Xen crashed. Adding the call on the Xen
entry path fixed the issue later. Other times features may require more
work, such is the case with Kasan. Kasan currently only works on bare
metal and KVM, but not on Xen when PV guests are used. There was an oversight
on review. When I pointed the possible issue with Kasan it was thought that
adding support for Kasan for Xen would be trivial but upon review with the
Kasan maintainer we've realized it may take a bit more work. I've also
confirmed that enabling it on Xen crashes it. There's no fix in sight
and we can't disable Kasan at run time...
What I'm after is a proactive stop-gap that helps us ensure *early boot*
requirements are spelled out so that if some developer is adding a new
feature which mucks with *early boot code* they'll have to consider both entry
points. It will be fine if they only add support for bare metal -- but at least
it will be clearly annotated then. With this in place for instance I would
have expected Kasan would not have been merged unless Xen was also supportedddd
or at the very least Kasan got support to be disabled at run time in early
boot (and then Xen would disable it until it gets support for it). I won't bore
you with more details but if you want to read more about what I mean:
http://www.do-not-panic.com/2015/12/avoiding-dead-code-pvops-not-silver-bullet.html
http://www.do-not-panic.com/2015/12/xen-and-x86-linux-zero-page.html
Now just keep this type of proactive use case in mind for now, its one of
the things I was originally after with the linker tables and x86's use
for it.
> Let's go over your series and see whether and how that principle can be applied:
>
> - patch #1, #2: should be dropped
>
> - patch #3, #4: EBDA support.
>
> The EBDA BIOS signature is an ancient data structure, starting off at
> physical memory 0x40E - which is the very first physical memory page of the
> system.
>
> We should add an x86_ebda_bios flag that is set to 1 by default, but which
> paravirt bootup can set to 0. That would avoid the reservation of the BIOS
> area and will save a bit of RAM.
Using a flag is certainly one way to go about it. With the subarch approach
I just wanted to point out that my goal was to eventually fold it underneath
the code, we'd just declare the ebda reservation as follows:
x86_init_early_pc_simple(reserve_ebda_region);
See:
http://lkml.kernel.org/r/1455891343-10016-4-git-send-email-mcgrof@xxxxxxxxxx
Since linker tables are used the ebda call would just be one of the init
calls that x86_init_fn_early_init() would call. The relevant stub for that:
void __init x86_init_fn_early_init(void)
{
struct x86_init_fn *init_fn;
unsigned int num_inits = LINKTABLE_SIZE(x86_init_fns);
if (!num_inits)
return;
pr_debug("Number of init entries: %d\n", num_inits);
LINKTABLE_FOR_EACH(init_fn, x86_init_fns) {
if (!x86_init_fn_supports_subarch(init_fn))
continue;
...
init_fn->early_init();
}
}
And its helper:
static bool x86_init_fn_supports_subarch(struct x86_init_fn *fn)
{
if (!fn->supp_hardware_subarch) {
pr_err("Init sequence fails to declares any supported subarchs: %pF\n", fn->early_init);
WARN_ON(1);
}
if (BIT(boot_params.hdr.hardware_subarch) & fn->supp_hardware_subarch)
return true;
return false;
}
struct x86_init_fn would have a __u32 supp_hardware_subarch of supported
subarchs as a bitmask.
What gains would we have to fold such a subarch check on early code?
Well for starters it can provide a proactive means stop-gap measure
for discrepancies between the different entry points to early boot
code. When developers write new early boot code they'd have to
think about which subarchs are supported.
So we can take either approach:
a) use subarch with the plan to only use it only early x86
boot code, in this case something like x86_init_early_pc_simple().
b) add x86_ebda_bios flag and have PV disable it as you note
But I think if we wanted to remain compatible with a proactive strategy
using a) strategy we'd need a flag 'per feature'? I think approach a) scales
a bit more easily in this regard.
Mind you, if you say ebda should not use subarch I understand that it does not
mean other code which I'm adding should not use this linker table arrangement
and subarch, in particular if the subarch was already used (see use of linker
table on MID early boot code [0]), but some guidance if it should not
even be considered for something proactive as I'm trying to achieve as
with kasan (as an example) would be appreciated.
[0] http://lkml.kernel.org/r/1455891343-10016-7-git-send-email-mcgrof@xxxxxxxxxx
> - patch #5, #6, #7: looks good, does not use a subarch flag
OK thanks.
> - patch #8: f00f workaround. Subarch flag use is wrong. The complication with
> this workaround is that it uses MM tricks to install an IDT. Could you check
> whether Xen truly needs this quirk?
I reviewed this at the Xen developer summit and have been through these
patches: the theory goes that xen and lguest do not need the f00f workaround
(so as per your language needs the quirk) as the hypervisor should be the one
doing the work around. Konrad also mentioned though that he actually thought
that perhaps the work around alone could be removed from the kernel all
together already -- but he was unable to provide a direct confirmation of that.
So yes we need it.
> If yes then there should be a new flag,
> something like x86_idt_readonly, which is set to 0 but Xen can set it to 1. If
> that flag is set then the F00F workaround does not have to be installed.
>
> Or something like that: the point is to use a specific flag.
OK.
> - Patch #9, #10: RTC support. The problem with RTC platform driver is that it's
> not possible to detect the RTC reliably - so we sometimes have to quirk it off.
>
> Instead of using bitflags, add something like x86_platform.rtc_available, which
> defaults to 1. Don't add negation to the name and don't use bitflags - use a
> byte flag.
OK.
> - Patch #11: this patch wants to disable the PNP BIOS code.
>
> The complication with the PNP BIOS code is that the PNP BIOS is defined
> in physical memory, in the 0xf0000-0xffff0 memory range - a 64kB large area.
> Paravirt images don't want to waste 64kB of RAM just to tell the kernel that no
> PNP BIOS is available.
>
> But instead of using subarch flags, please define a pnpbios_disable() API call
> that paravirt bootup can call into. That can disable the PNP BIOS code.
OK, sure, much better thanks.
> Also, feel free to define a single 'disable PC legacies' quirk function that
> disables all the usual PC legacies that paravirt does not care about or does not
> want to define via RAM. Specific paravirt initialization functions would only have
> to call this one (shared) function.
Sure.
Luis