Re: [PATCH 8/8] arm64: cpufeature: Add an overview comment for the cpufeature framework

From: Suzuki K Poulose
Date: Thu Apr 16 2020 - 10:55:04 EST


Hi Will,

On 04/14/2020 10:31 PM, Will Deacon wrote:
Now that Suzuki isn't within throwing distance, I thought I'd better add
a rough overview comment to cpufeature.c so that it doesn't take me days
to remember how it works next time.

Signed-off-by: Will Deacon <will@xxxxxxxxxx>
---
arch/arm64/kernel/cpufeature.c | 43 ++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 680a453ca8c4..421ca99dc8fc 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -3,6 +3,49 @@
* Contains CPU feature definitions
*
* Copyright (C) 2015 ARM Ltd.
+ *
+ * A note for the weary kernel hacker: the code here is confusing and hard to
+ * follow! That's partly because it's solving a nasty problem, but also because
+ * there's a little bit of over-abstraction that tends to obscure what's going
+ * on behind a maze of helper functions and macros.

Thanks for writing this up !

+ *
+ * The basic problem is that hardware folks have started gluing together CPUs
+ * with distinct architectural features; in some cases even creating SoCs where
+ * user-visible instructions are available only on a subset of the available
+ * cores. We try to address this by snapshotting the feature registers of the
+ * boot CPU and comparing these with the feature registers of each secondary
+ * CPU when bringing them up. If there is a mismatch, then we update the
+ * snapshot state to indicate the lowest-common denominator of the feature,
+ * known as the "safe" value. This snapshot state can be queried to view the

I am not sure if the following is implied above.

1) Against the "snapshot" state, where mismatches triggers updating
the "snapshot" state to reflect the "safe" value.

2) Compared against the CPU feature registers of *the boot CPU* for
"FTR_STRICT" fields and any mismatch triggers TAINT_CPU_OUT_OF_SPEC.
This makes sure that warning is generated for each OUT_OF_SPEC
secondary CPU.

+ * "sanitised" value of a feature register.
+ *
+ * The sanitised register values are used to decide which capabilities we
+ * have in the system. These may be in the form of traditional "hwcaps"
+ * advertised to userspace or internal "cpucaps" which are used to configure
+ * things like alternative patching and static keys. While a feature mismatch
+ * may result in a TAINT_CPU_OUT_OF_SPEC kernel taint, a capability mismatch
+ * may prevent a CPU from being onlined at all.
+ *
+ * Some implementation details worth remembering:
+ *
+ * - Mismatched features are *always* sanitised to a "safe" value, which
+ * usually indicates that the feature is not supported.
+ *
+ * - A mismatched feature marked with FTR_STRICT will cause a "SANITY CHECK"
+ * warning when onlining an offending CPU and the kernel will be tainted
+ * with TAINT_CPU_OUT_OF_SPEC.

As mentioned above, this check is against that of the "boot CPU"
register state, which may not be implicit from the statement.

+ *
+ * - Features marked as FTR_VISIBLE have their sanitised value visible to
+ * userspace. FTR_VISIBLE features in registers that are only visible
+ * to EL0 by trapping *must* have a corresponding HWCAP so that late
+ * onlining of CPUs cannot lead to features disappearing at runtime.
+ *

As you mentioned in the other response we could add information about
the guest view, something like :

- KVM exposes the sanitised value of the feature registers to the
guests and is not affected by the FTR_VISIBLE. However,
depending on the individual feature support in the hypervisor,
some of the fields may be capped/limited.

Cheers
Suzuki