Re: [PATCH 8/8] arm64: cpufeature: Add an overview comment for the cpufeature framework
From: Will Deacon
Date: Thu Apr 16 2020 - 14:12:15 EST
On Thu, Apr 16, 2020 at 03:59:39PM +0100, Suzuki K Poulose wrote:
> On 04/14/2020 10:31 PM, Will Deacon wrote:
> > Now that Suzuki isn't within throwing distance, I thought I'd better add
> > a rough overview comment to cpufeature.c so that it doesn't take me days
> > to remember how it works next time.
> >
> > Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> > ---
> > arch/arm64/kernel/cpufeature.c | 43 ++++++++++++++++++++++++++++++++++
> > 1 file changed, 43 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 680a453ca8c4..421ca99dc8fc 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -3,6 +3,49 @@
> > * Contains CPU feature definitions
> > *
> > * Copyright (C) 2015 ARM Ltd.
> > + *
> > + * A note for the weary kernel hacker: the code here is confusing and hard to
> > + * follow! That's partly because it's solving a nasty problem, but also because
> > + * there's a little bit of over-abstraction that tends to obscure what's going
> > + * on behind a maze of helper functions and macros.
>
> Thanks for writing this up!
It's purely a selfish thing ;)
> > + * The basic problem is that hardware folks have started gluing together CPUs
> > + * with distinct architectural features; in some cases even creating SoCs where
> > + * user-visible instructions are available only on a subset of the available
> > + * cores. We try to address this by snapshotting the feature registers of the
> > + * boot CPU and comparing these with the feature registers of each secondary
> > + * CPU when bringing them up. If there is a mismatch, then we update the
> > + * snapshot state to indicate the lowest-common denominator of the feature,
> > + * known as the "safe" value. This snapshot state can be queried to view the
>
> I am not sure if the following is implied above.
>
> 1) Against the "snapshot" state, where a mismatch triggers updating
> the "snapshot" state to reflect the "safe" value.
>
> 2) Compared against the CPU feature registers of *the boot CPU* for
> "FTR_STRICT" fields and any mismatch triggers TAINT_CPU_OUT_OF_SPEC.
> This makes sure that a warning is generated for each OUT_OF_SPEC
> secondary CPU.
I was trying to avoid talking about the consequences of a mismatch in that
paragraph, and instead cover them below:
> > + * The sanitised register values are used to decide which capabilities we
> > + * have in the system. These may be in the form of traditional "hwcaps"
> > + * advertised to userspace or internal "cpucaps" which are used to configure
> > + * things like alternative patching and static keys. While a feature mismatch
> > + * may result in a TAINT_CPU_OUT_OF_SPEC kernel taint, a capability mismatch
> > + * may prevent a CPU from being onlined at all.
Do you think something is missing here?
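
To make the feature/capability distinction in the quoted paragraph concrete, here is a minimal sketch of how capabilities might be derived from a sanitised register value and then used to vet a late CPU. Everything named here (CAP_A, CAP_B, caps_from_reg(), late_cpu_ok(), the 4-bit field layout) is hypothetical and is not taken from cpufeature.c. It also assumes the strictest policy, where a late CPU must have every capability the system already relies on; the real code decides this per capability.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits, standing in for "cpucaps". */
#define CAP_A	(1u << 0)
#define CAP_B	(1u << 1)

/*
 * Derive capabilities from a register value. Assumption: each capability
 * is backed by one 4-bit field and a non-zero field means "supported".
 */
static uint32_t caps_from_reg(uint64_t reg)
{
	uint32_t caps = 0;

	if ((reg >> 0) & 0xf)
		caps |= CAP_A;
	if ((reg >> 4) & 0xf)
		caps |= CAP_B;

	return caps;
}

/*
 * Decide whether a late CPU may be onlined. Assumption: once the system
 * capabilities are finalised, a CPU lacking any of them is rejected.
 */
static bool late_cpu_ok(uint32_t system_caps, uint64_t cpu_reg)
{
	uint32_t cpu_caps = caps_from_reg(cpu_reg);

	return (system_caps & ~cpu_caps) == 0;
}
```

With system capabilities computed from a sanitised value of 0x11, a late CPU reporting 0x21 is accepted, since both fields are non-zero. One reporting 0x01 is rejected because it lacks the second capability.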
> > + *
> > + * Some implementation details worth remembering:
> > + *
> > + * - Mismatched features are *always* sanitised to a "safe" value, which
> > + * usually indicates that the feature is not supported.
> > + *
> > + * - A mismatched feature marked with FTR_STRICT will cause a "SANITY CHECK"
> > + * warning when onlining an offending CPU and the kernel will be tainted
> > + * with TAINT_CPU_OUT_OF_SPEC.
>
> As mentioned above, this check is against that of the "boot CPU"
> register state, which may not be implicit from the statement.
Hmm, I'm trying to figure out if this matters. I suppose this means you
get a SANITY CHECK warning for every mismatching secondary CPU, but that's
implied by the above. Is there something else I'm missing?
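
The two comparisons under discussion can be sketched as follows. This is an illustrative model only: the struct and function names are hypothetical, fields are assumed to be 4 bits wide, and the "safe" value is assumed to be the lower of the two (the comment being reviewed only says the safe value usually means "not supported"). The point it shows is that the strict check compares against the boot CPU value rather than the already-sanitised snapshot, so every mismatching secondary trips it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one 4-bit ID register field. */
struct ftr_field {
	unsigned int shift;	/* bit position of the field */
	bool strict;		/* stand-in for FTR_STRICT */
};

struct ftr_state {
	uint64_t boot_val;	/* raw value read on the boot CPU */
	uint64_t sys_val;	/* sanitised system-wide snapshot */
	bool tainted;		/* stand-in for TAINT_CPU_OUT_OF_SPEC */
};

static uint64_t field_get(uint64_t reg, const struct ftr_field *f)
{
	assert(f->shift <= 60);
	return (reg >> f->shift) & 0xf;
}

static uint64_t field_set(uint64_t reg, const struct ftr_field *f, uint64_t v)
{
	uint64_t mask = (uint64_t)0xf << f->shift;

	return (reg & ~mask) | ((v << f->shift) & mask);
}

static void ftr_init(struct ftr_state *s, uint64_t boot_val)
{
	s->boot_val = boot_val;
	s->sys_val = boot_val;
	s->tainted = false;
}

/* Returns the number of strict mismatches ("warnings") for this CPU. */
static int ftr_check_secondary(struct ftr_state *s,
			       const struct ftr_field *fields, size_t n,
			       uint64_t cpu_val)
{
	int warnings = 0;

	for (size_t i = 0; i < n; i++) {
		const struct ftr_field *f = &fields[i];
		uint64_t cpu = field_get(cpu_val, f);

		/* 1) Against the snapshot: fall back to the safe (lower) value. */
		if (cpu < field_get(s->sys_val, f))
			s->sys_val = field_set(s->sys_val, f, cpu);

		/* 2) Against the boot CPU: strict fields must match exactly. */
		if (f->strict && cpu != field_get(s->boot_val, f)) {
			warnings++;
			s->tainted = true;
		}
	}

	return warnings;
}
```

For example, with a boot value of 0x21 and a strict field at bit 0, two secondaries that both report 0x10 each produce one warning, even though the second one already matches the sanitised snapshot.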
> > + *
> > + * - Features marked as FTR_VISIBLE have their sanitised value visible to
> > + * userspace. FTR_VISIBLE features in registers that are only visible
> > + * to EL0 by trapping *must* have a corresponding HWCAP so that late
> > + * onlining of CPUs cannot lead to features disappearing at runtime.
> > + *
>
> As you mentioned in the other response we could add information about
> the guest view, something like:
>
> - KVM exposes the sanitised value of the feature registers to the
> guests and is not affected by FTR_VISIBLE. However,
> depending on the individual feature support in the hypervisor,
> some of the fields may be capped/limited.
In light of Marc's comment, I'll add something here along the lines of:
"KVM exposes its own view of the feature registers to guest operating
systems regardless of FTR_VISIBLE. This is typically driven from the
sanitised register values to allow virtual CPUs to be migrated between
arbitrary physical CPUs, but some features not present on the host are
also advertised and emulated. Look at sys_reg_descs[] for the gory
details."
Will