Re: [PATCH v2 2/6] arm64: alternative: Apply alternatives early in boot process

From: Julien Thierry
Date: Fri May 11 2018 - 04:13:10 EST

On 09/05/18 22:52, Suzuki K Poulose wrote:
On 05/04/2018 11:06 AM, Julien Thierry wrote:

In order to prepare the v3 of this patchset, I'd like people's opinion on what this patch does. More below.

On 17/01/18 11:54, Julien Thierry wrote:
From: Daniel Thompson <daniel.thompson@xxxxxxxxxx>

Currently alternatives are applied very late in the boot process (and
a long time after we enable scheduling). Some alternative sequences,
such as those that alter the way CPU context is stored, must be applied
much earlier in the boot sequence.

+ * early-apply features are detected using only the boot CPU and checked on
+ * secondary CPUs startup, even then,
+ * These early-apply features should only include features where we must
+ * patch the kernel very early in the boot process.
+ *
+ * Note that the cpufeature logic *must* be made aware of early-apply
+ * features to ensure they are reported as enabled without waiting
+ * for other CPUs to boot.
+ */

Following the change in the cpufeature infrastructure, ARM64_HAS_SYSREG_GIC_CPUIF will have the scope ARM64_CPUCAP_SCOPE_BOOT_CPU in order to be checked early in the boot process.

That's correct.

Now, regarding the early application of alternatives, I am wondering whether we can apply all the alternatives associated with SCOPE_BOOT features that *do not* have a cpu_enable callback.

I don't understand why you would skip the ones that have a "cpu_enable" callback. Could you explain this a bit? Ideally you should be able to
apply the alternatives for features with SCOPE_BOOT, provided the
cpu_enable() callback is written properly.

In my mind the "cpu_enable" callback is the setup a CPU should perform before using the feature (i.e. the code getting patched in by the alternative). So I was worried about the code getting patched by the boot CPU and the secondary CPUs then executing patched code before the cpu_enable for the corresponding feature gets called.
Or is there a requirement for secondary CPU startup code to be free of alternative code?
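
To make the ordering concern above concrete, here is a minimal user-space model of it. This is only a sketch, not kernel code: struct cpu_model and every model_* name are hypothetical, and it assumes that a patched site is only safe on a CPU that has already run cpu_enable for the feature.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU; not a kernel structure. */
struct cpu_model {
	bool feature_enabled;	/* true once cpu_enable() has run on this CPU */
};

/* True once the boot CPU has patched the (shared) kernel text. */
static bool model_text_patched;

/* Stand-in for the boot CPU applying alternatives early. */
static void model_apply_alternatives_early(void)
{
	model_text_patched = true;
}

/* Stand-in for the feature's cpu_enable callback running on one CPU. */
static void model_cpu_enable(struct cpu_model *cpu)
{
	cpu->feature_enabled = true;
}

/*
 * A patched site is assumed safe only on a CPU that has done its
 * cpu_enable setup. Unpatched text is always safe in this model.
 */
static bool model_patched_site_is_safe(const struct cpu_model *cpu)
{
	return !model_text_patched || cpu->feature_enabled;
}
```

In this model, a secondary CPU that starts after model_apply_alternatives_early() but before its own model_cpu_enable() reports an unsafe site, which is the window I was worried about.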

Otherwise we can keep the macro that individually lists each feature that is patchable at boot time, as the current patch does (or put this info in a flag within the arm64_cpu_capabilities structure).

You may be able to build up the mask of *available* capabilities with SCOPE_BOOT at boot time with some trick in setup_boot_cpu_capabilities(), rather than embedding it in the capabilities (and then parsing the entire table(s)) or manually keeping track of the capabilities in a separate mask.

Yes, I like that idea.
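
A rough sketch of how such a mask could be accumulated while the boot CPU capabilities are being set up. Everything here is an assumption for illustration: the sketch_* names, the struct layout and the scope constants are stand-ins and do not reflect the actual cpufeature code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the real scope flags (hypothetical values). */
#define SKETCH_SCOPE_BOOT_CPU	(1u << 0)
#define SKETCH_SCOPE_SYSTEM	(1u << 1)

/* Simplified stand-in for struct arm64_cpu_capabilities. */
struct sketch_capability {
	unsigned int capability;	/* capability number, used as a bit index */
	unsigned int scope;		/* SKETCH_SCOPE_* flags */
	bool (*matches)(void);		/* detection routine run on the boot CPU */
};

/* Mask of boot-scope capabilities found on the boot CPU. */
static unsigned long sketch_boot_capabilities;

/* Trivial detection routines, only here to exercise the sketch. */
static bool sketch_detect_yes(void) { return true; }
static bool sketch_detect_no(void) { return false; }

/*
 * Walk the table once and record every boot-scope capability that the
 * boot CPU actually has. Capability numbers are assumed to be smaller
 * than the number of bits in an unsigned long.
 */
static void sketch_setup_boot_cpu_capabilities(const struct sketch_capability *caps,
					       size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!(caps[i].scope & SKETCH_SCOPE_BOOT_CPU))
			continue;
		if (caps[i].matches())
			sketch_boot_capabilities |= 1UL << caps[i].capability;
	}
}
```

The resulting mask could then be passed to the early patching step in place of a hand-maintained macro, so that only capabilities which are both boot-scope and present get patched early.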



Any thoughts or preferences on this?


 #define __ALT_PTR(a,f) ((void *)&(a)->f + (a)->f)
 #define ALT_ORIG_PTR(a) __ALT_PTR(a, orig_offset)
 #define ALT_REPL_PTR(a) __ALT_PTR(a, alt_offset)
@@ -105,7 +117,8 @@ static u32 get_alt_insn(struct alt_instr *alt, __le32 *insnptr, __le32 *altinsnp
 	return insn;

-static void __apply_alternatives(void *alt_region, bool use_linear_alias)
+static void __apply_alternatives(void *alt_region, bool use_linear_alias,
+				 unsigned long feature_mask)
 	struct alt_instr *alt;
 	struct alt_region *region = alt_region;
@@ -115,6 +128,9 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
 		u32 insn;
 		int i, nr_inst;

+		if ((BIT(alt->cpufeature) & feature_mask) == 0)
+			continue;
 		if (!cpus_have_cap(alt->cpufeature))

@@ -138,6 +154,21 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)

+ * This is called very early in the boot process (directly after we run
+ * a feature detect on the boot CPU). No need to worry about other CPUs
+ * here.
+ */
+void apply_alternatives_early(void)
+	struct alt_region region = {
+		.begin	= (struct alt_instr *)__alt_instructions,
+		.end	= (struct alt_instr *)__alt_instructions_end,
+	};
+	__apply_alternatives(&region, true, EARLY_APPLY_FEATURE_MASK);
  * We might be patching the stop_machine state machine, so implement a
  * really simple polling protocol here.
  */
@@ -156,7 +187,9 @@ static int __apply_alternatives_multi_stop(void *unused)
 	} else {
-		__apply_alternatives(&region, true);
+		__apply_alternatives(&region, true, ~EARLY_APPLY_FEATURE_MASK);
 		/* Barriers provided by the cache flushing */
@@ -177,5 +210,5 @@ void apply_alternatives(void *start, size_t length)
 		.end	= start + length,

-	__apply_alternatives(&region, false);
+	__apply_alternatives(&region, false, -1);
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 551eb07..37361b5 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -453,6 +453,12 @@ void __init smp_prepare_boot_cpu(void)
 	 * cpuinfo_store_boot_cpu() above.
 	update_cpu_errata_workarounds();
+	/*
+	 * We now know enough about the boot CPU to apply the
+	 * alternatives that cannot wait until interrupt handling
+	 * and/or scheduling is enabled.
+	 */
+	apply_alternatives_early();

 static u64 __init of_get_cpu_mpidr(struct device_node *dn)
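
For reference, the mask check added to __apply_alternatives() in the quoted patch partitions the alternatives between its three callers. A standalone sketch of that check follows; SKETCH_BIT, SKETCH_EARLY_CAP and sketch_should_patch are hypothetical names, and the capability number 2 is an arbitrary value chosen for illustration.

```c
#include <stdbool.h>

#define SKETCH_BIT(nr)		(1UL << (nr))
#define SKETCH_EARLY_CAP	2	/* hypothetical early-apply capability number */
#define SKETCH_EARLY_MASK	SKETCH_BIT(SKETCH_EARLY_CAP)

/*
 * Mirrors the test in the patch: an alternative is considered only if
 * its capability bit is set in the caller's mask. The capability
 * number must be smaller than the number of bits in an unsigned long.
 */
static bool sketch_should_patch(unsigned int cpufeature,
				unsigned long feature_mask)
{
	return (SKETCH_BIT(cpufeature) & feature_mask) != 0;
}
```

With SKETCH_EARLY_MASK only capability 2 is patched (the early pass), with ~SKETCH_EARLY_MASK everything except capability 2 is patched (the stop_machine pass), and with an all-ones mask nothing is filtered out (the apply_alternatives() path taking start and length).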

Julien Thierry