Re: [PATCH] x86/smp: Validate APIC ID before parking CPU in INIT
From: Thomas Gleixner
Date: Wed Aug 09 2023 - 14:42:29 EST
On Wed, Jul 19 2023 at 05:13, Vasant Hegde wrote:
> The commit below causes kexec to hang in certain scenarios with more than
> 255 CPUs.
>
> Steps to reproduce:
> - Use a 2-socket system with 384 CPUs.
> - Boot the first kernel with intremap=off on the kernel command line.
>   This disables x2APIC in the kernel, so it boots in APIC mode.
> - During kexec the kernel tries to send INIT to all CPUs except the boot CPU.
>   If a CPU's APIC ID is 0x100 (as in our case), the destination is truncated
>   to 0x00, so the INIT is sent to CPU0 and the system hangs (in APIC mode
>   the DEST field is only 8 bits wide).

It took me a while to decode the above.

> Fix this issue by adding an apic->apic_id_valid() check before sending
> the INIT sequence.

Sigh, yes.

> Fixes: 45e34c8af58f ("x86/smp: Put CPUs into INIT on shutdown if possible")
> Reported-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@xxxxxxx>
> Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@xxxxxxx>
> Signed-off-by: Vasant Hegde <vasant.hegde@xxxxxxx>
> ---
> arch/x86/kernel/smpboot.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index e1aa2cd7734b..e5ca0689c4dd 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1360,7 +1360,7 @@ bool smp_park_other_cpus_in_init(void)
>  		if (cpu == this_cpu)
>  			continue;
>  		apicid = apic->cpu_present_to_apicid(cpu);
> -		if (apicid == BAD_APICID)
> +		if (apicid == BAD_APICID || !apic->apic_id_valid(apicid))
>  			continue;
>  		send_init_sequence(apicid);
>  	}
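
For illustration only, the truncation described in the report and the effect of the added check can be sketched in standalone userspace C. This is not the kernel code: every sketch_* name and SKETCH_BAD_APICID are hypothetical stand-ins, and the "fits below 255" rule is an assumption about what apic->apic_id_valid() does in non-x2APIC mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's BAD_APICID marker. */
#define SKETCH_BAD_APICID 0xFFFFFFFFu

/* With an 8-bit DEST field only the low 8 bits of the APIC ID survive. */
static uint8_t sketch_xapic_dest(uint32_t apicid)
{
	return (uint8_t)(apicid & 0xFFu);
}

/* Assumed validity rule for the 8-bit case: the ID must be below 255. */
static bool sketch_apic_id_valid(uint32_t apicid)
{
	return apicid < 255;
}

/*
 * Count the CPUs that would receive INIT. This mirrors only the filtering
 * of the patched loop: the sender is skipped, and so is any APIC ID that
 * is bad or not addressable. No IPIs are sent.
 */
static size_t sketch_count_init_targets(const uint32_t *apicids, size_t n,
					 size_t this_cpu)
{
	size_t targets = 0;

	assert(apicids != NULL);
	for (size_t cpu = 0; cpu < n; cpu++) {
		if (cpu == this_cpu)
			continue;
		if (apicids[cpu] == SKETCH_BAD_APICID ||
		    !sketch_apic_id_valid(apicids[cpu]))
			continue;
		targets++;
	}
	return targets;
}
```

With APIC IDs { 0x00, 0x01, 0x100 } and CPU 0 as the sender, only 0x01 is counted as a target. Without the validity check, 0x100 would go out as destination 0x00, which is the case the report describes.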