Re: [patch v3 08/36] x86/smpboot: Split up native_cpu_up() into separate phases and document them

From: Peter Zijlstra
Date: Tue May 09 2023 - 06:06:52 EST


On Mon, May 08, 2023 at 09:43:39PM +0200, Thomas Gleixner wrote:

> @@ -233,14 +237,31 @@ static void notrace start_secondary(void
> load_cr3(swapper_pg_dir);
> __flush_tlb_all();
> #endif
> + /*
> + * Sync point with wait_cpu_initialized(). Before proceeding through
> + * cpu_init(), the AP will call wait_for_master_cpu() which sets its
> + * own bit in cpu_initialized_mask and then waits for the BSP to set
> + * its bit in cpu_callout_mask to release it.
> + */
> cpu_init_secondary();
> rcu_cpu_starting(raw_smp_processor_id());
> x86_cpuinit.early_percpu_clock_init();
> +
> + /*
> + * Sync point with wait_cpu_callin(). The AP doesn't wait here
> + * but just sets the bit to let the controlling CPU (BSP) know that
> + * it's got this far.
> + */
> smp_callin();
>
> - /* otherwise gcc will move up smp_processor_id before the cpu_init */
> + /* Otherwise gcc will move up smp_processor_id() before cpu_init() */
> barrier();

Not to the detriment of this patch, but this barrier() and its comment
seem weird vs smp_callin(). That function ends with an atomic bitop (it
has to, at the very least it must not be weaker than store-release) but
also has an explicit wmb() to order setup vs CPU_STARTING.

(arguably that should be a full fence *AND* get a comment)

There is no way the smp_processor_id() referred to in this comment can
land before cpu_init(), even without the barrier().
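
For reference, the tail of smp_callin() I'm talking about looks roughly
like this (paraphrased, not verbatim, most of the body elided):

	static void smp_callin(void)
	{
		int cpuid = smp_processor_id();

		/* ... APIC setup, sibling map, calibrate_delay(), ... */

		/* Order all of the above vs the CPU_STARTING notifiers. */
		wmb();
		notify_cpu_starting(cpuid);

		/* Sync point with wait_cpu_callin(); the AP doesn't wait. */
		cpumask_set_cpu(cpuid, cpu_callin_mask);
	}

Both the wmb() and the LOCK'ed bitop behind cpumask_set_cpu() carry a
"memory" clobber on x86, so the compiler can't hoist anything from
after the smp_callin() call site back across it regardless of the
explicit barrier().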

> - /* Check TSC synchronization with the control CPU: */
> +
> + /*
> + * Check TSC synchronization with the control CPU, which will do
> + * its part of this from wait_cpu_online(), making it an implicit
> + * synchronization point.
> + */
> check_tsc_sync_target();
>
> /*