[PATCH v3 05/12] arm64: smp: Defer RCU registration during secondary CPU bringup

From: Jinjie Ruan

Date: Wed Jun 24 2026 - 05:27:28 EST


From: Will Deacon <will@xxxxxxxxxx>

Calling rcutree_report_cpu_starting() early during boot can lead to
livelocks with the generic CPU hotplug mechanism if the boot CPU blocks
on an RCU grace period while the CPU being onlined is spinning in
cpuhp_ap_sync_alive(). Therefore, cpuhp_ap_sync_alive() must be called
before rcutree_report_cpu_starting().

To prevent a potential deadlock on the boot CPU,
check_local_cpu_capabilities() must be executed before
cpuhp_ap_sync_alive(). This ensures that if an early capability mismatch
occurs and the AP invokes cpu_die_early(), the boot CPU can detect
the boot timeout and proceed, rather than hanging indefinitely.

In preparation for enabling the generic CPU hotplug code on arm64, split
up the trace_hardirqs_off() call during secondary CPU bringup so that we
update lockdep early but defer the tracing updates until after
RCU is ready.

Furthermore, to support parallel bringup without triggering false RCU CPU
stall warnings or deadlocks, the initialization order must be:

secondary_start_kernel()
-> lockdep_hardirqs_off()
-> check_local_cpu_capabilities()
-> cpuhp_ap_sync_alive()
-> rcutree_report_cpu_starting()
-> trace_hardirqs_off_finish()
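
The ordering constraints above can be illustrated with a small userspace
model. This is only a sketch: every model_* function, the step_log array
and the rcu_watching flag are hypothetical stand-ins invented for
illustration, not the kernel functions themselves. The assertions encode
the two rules from this description: the capability check and the
sync-alive step run while RCU is still offline, and the tracing update
runs only after RCU has been told about the CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel steps named above. */
enum bringup_step {
	STEP_LOCKDEP_HARDIRQS_OFF,
	STEP_CHECK_CAPABILITIES,
	STEP_SYNC_ALIVE,
	STEP_RCU_STARTING,
	STEP_TRACE_HARDIRQS_OFF_FINISH,
	NR_STEPS,
};

static enum bringup_step step_log[NR_STEPS];
static int nr_steps_logged;
static bool rcu_watching;	/* models "RCU is online on this CPU" */

static void log_step(enum bringup_step step)
{
	assert(nr_steps_logged < NR_STEPS);
	step_log[nr_steps_logged++] = step;
}

static void model_lockdep_hardirqs_off(void)
{
	log_step(STEP_LOCKDEP_HARDIRQS_OFF);
}

static void model_check_local_cpu_capabilities(void)
{
	/* Runs before the sync point, hence while RCU is still offline. */
	assert(!rcu_watching);
	log_step(STEP_CHECK_CAPABILITIES);
}

static void model_cpuhp_ap_sync_alive(void)
{
	/* Must precede RCU registration to avoid the livelock case. */
	assert(!rcu_watching);
	log_step(STEP_SYNC_ALIVE);
}

static void model_rcutree_report_cpu_starting(void)
{
	rcu_watching = true;
	log_step(STEP_RCU_STARTING);
}

static void model_trace_hardirqs_off_finish(void)
{
	/* Tracing updates are deferred until RCU is ready. */
	assert(rcu_watching);
	log_step(STEP_TRACE_HARDIRQS_OFF_FINISH);
}

/* Runs the modeled sequence and returns the number of steps executed. */
int model_secondary_start_kernel(void)
{
	nr_steps_logged = 0;
	rcu_watching = false;

	model_lockdep_hardirqs_off();
	model_check_local_cpu_capabilities();
	model_cpuhp_ap_sync_alive();
	model_rcutree_report_cpu_starting();
	model_trace_hardirqs_off_finish();

	return nr_steps_logged;
}
```

Calling model_secondary_start_kernel() from a test harness runs the
sequence once; swapping any two calls that violate the rules trips one
of the assertions.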

Because check_local_cpu_capabilities() must execute while RCU is still
offline on the local CPU, its early lock acquisitions would normally
trigger a false-positive lockdep "suspicious RCU usage" splat, as commit
ce3d31ad3cac ("arm64/smp: Move rcu_cpu_starting() earlier") pointed out.

Resolve this lockdep splat by wrapping the early capability verification
path in lockdep_off() and lockdep_on(). This suppresses the
false-positive RCU usage reports on the offline CPU while preserving
the initialization order required for race-free parallel bringup.
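
As a rough illustration of the suppression, the sketch below models
lockdep_off()/lockdep_on() as a nesting counter that a validation hook
consults. It is a simplified userspace model under stated assumptions:
all model_* names, the splat counter and the check itself are
hypothetical and do not reproduce the kernel's lockdep implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of this is kernel code. */
static int model_lockdep_off_depth;	/* > 0 means validation is off */
static bool model_rcu_online;		/* false during early bringup */
static int model_splats;		/* count of reported warnings */

static void model_lockdep_off(void)
{
	model_lockdep_off_depth++;
}

static void model_lockdep_on(void)
{
	assert(model_lockdep_off_depth > 0);
	model_lockdep_off_depth--;
}

/* Stand-in for a lock acquisition that the validator would inspect. */
static void model_lock_acquire(void)
{
	if (model_lockdep_off_depth > 0)
		return;		/* validation suppressed in this window */
	if (!model_rcu_online)
		model_splats++;	/* models a "suspicious RCU usage" report */
}

/* Capability check without the wrapper: returns the splat count. */
int model_check_caps_unwrapped(void)
{
	model_splats = 0;
	model_rcu_online = false;
	model_lock_acquire();
	return model_splats;
}

/* Capability check inside the off/on window: returns the splat count. */
int model_check_caps_wrapped(void)
{
	model_splats = 0;
	model_rcu_online = false;
	model_lockdep_off();
	model_lock_acquire();
	model_lockdep_on();
	return model_splats;
}
```

In this model the unwrapped path reports one splat and the wrapped path
reports none. Note that the wrapper disables all validation inside the
window, not only the RCU check, which is why the window is kept as
narrow as the capability check itself.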

Signed-off-by: Will Deacon <will@xxxxxxxxxx>
Co-developed-by: Jinjie Ruan <ruanjinjie@xxxxxxxxxx>
Signed-off-by: Jinjie Ruan <ruanjinjie@xxxxxxxxxx>
---
arch/arm64/kernel/smp.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index c14b179c595d..87f92cf9ffa8 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -35,6 +35,7 @@
#include <linux/kgdb.h>
#include <linux/kvm_host.h>
#include <linux/nmi.h>
+#include <linux/lockdep.h>

#include <asm/alternative.h>
#include <asm/atomic.h>
@@ -215,15 +216,23 @@ asmlinkage notrace void secondary_start_kernel(void)
if (system_uses_irq_prio_masking())
init_gic_priority_masking();

- rcutree_report_cpu_starting(cpu);
- trace_hardirqs_off();
+ lockdep_hardirqs_off(CALLER_ADDR0);

+ /*
+ * Since RCU is still offline on this CPU, any nested native printk
+ * or lock acquisition would normally trigger a false-positive
+ * "suspicious RCU usage" lockdep splat.
+ */
+ lockdep_off();
/*
* If the system has established the capabilities, make sure
* this CPU ticks all of those. If it doesn't, the CPU will
* fail to come online.
*/
check_local_cpu_capabilities();
+ lockdep_on();
+ rcutree_report_cpu_starting(cpu);
+ trace_hardirqs_off_finish();

ops = get_cpu_ops(cpu);
if (ops->cpu_postboot)
--
2.34.1