AW: x86/topology: Handle bogus ACPI tables correctly

From: Carsten Tolkmit
Date: Fri May 17 2024 - 16:19:33 EST


Hi Thomas,

I can confirm that your revised patch works as well as your first one.

Thanks!
Carsten

Von: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Gesendet: Freitag, 17. Mai 2024 16:40
An: Carsten Tolkmit <ctolkmit@xxxxxx>
Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>; x86@xxxxxxxxxx <x86@xxxxxxxxxx>
Betreff: x86/topology: Handle bogus ACPI tables correctly
 
The ACPI specification clearly states how the processors should be
enumerated in the MADT:

 "To ensure that the boot processor is supported post initialization,
  two guidelines should be followed. The first is that OSPM should
  initialize processors in the order that they appear in the MADT. The
  second is that platform firmware should list the boot processor as the
  first processor entry in the MADT.
  ...
  Failure of OSPM implementations and platform firmware to abide by
  these guidelines can result in both unpredictable and non optimal
  platform operation."

The kernel relies on that ordering to detect the real BSP on crash kernels
which is important to avoid sending a INIT IPI to it as that would cause a
full machine reset.

On a Dell XPS 16 9640 the BIOS ignores this rule and enumerates the CPUs in
the wrong order. As a consequence the kernel falsely detects a crash kernel
and disables the corresponding CPU.

Prevent this by checking the IA32_APICBASE MSR for the BSP bit on the boot
CPU. If that bit is set, then the MADT based BSP detection can be safely
ignored. If the kernel detects a mismatch between the BSP bit and the first
enumerated MADT entry then emit a firmware bug message.

This obviously also has to be taken into account when the boot APIC ID and
the first enumerated APIC ID match. If the boot CPU does not have the BSP
bit set in the APICBASE MSR then there is no way for the boot CPU to
determine which of the CPUs is the real BSP. Sending an INIT to the real
BSP would reset the machine so the only sane way to deal with that is to
limit the number of CPUs to one and emit a corresponding warning message.

Fixes: 5c5682b9f87a ("x86/cpu: Detect real BSP on crash kernels")
Reported-by: Carsten Tolkmit <ctolkmit@xxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218837
---

This is a slightly different solution than the initial patch I provided in
the bugzilla. Carsten, can you please test that again?
---
 arch/x86/kernel/cpu/topology.c |   43 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -128,6 +128,9 @@ static void topo_set_cpuids(unsigned int
 
 static __init bool check_for_real_bsp(u32 apic_id)
 {
+       bool is_bsp = false, has_apic_base = boot_cpu_data.x86 >= 6;
+       u64 msr;
+
         /*
          * There is no real good way to detect whether this a kdump()
          * kernel, but except on the Voyager SMP monstrosity which is not
@@ -144,17 +147,51 @@ static __init bool check_for_real_bsp(u3
         if (topo_info.real_bsp_apic_id != BAD_APICID)
                 return false;
 
+       /*
+        * Check whether the enumeration order is broken by evaluating the
+        * BSP bit in the APICBASE MSR. If the CPU does not have the
+        * APICBASE MSR then the BSP detection is not possible and the
+        * kernel must rely on the firmware enumeration order.
+        */
+       if (has_apic_base) {
+               rdmsrl(MSR_IA32_APICBASE, msr);
+               is_bsp = !!(msr & MSR_IA32_APICBASE_BSP);
+       }
+
         if (apic_id == topo_info.boot_cpu_apic_id) {
-               topo_info.real_bsp_apic_id = apic_id;
-               return false;
+               if (is_bsp || !has_apic_base) {
+                       topo_info.real_bsp_apic_id = apic_id;
+                       return false;
+               }
+               /*
+                * If the boot APIC is enumerated first, but the APICBASE
+                * MSR does not have the BSP bit set, then there is no way
+                * to discover the real BSP here. Assume a crash kernel and
+                * limit the number of CPUs to 1 as an INIT to the real BSP
+                * would reset the machine.
+                */
+               pr_warn("Enumerated BSP APIC %x is not marked in APICBASE MSR\n", apic_id);
+               pr_warn("Assuming crash kernel. Limiting to one CPU to prevent machine INIT\n");
+               set_nr_cpu_ids(1);
+               goto fwbug;
         }
 
-       pr_warn("Boot CPU APIC ID not the first enumerated APIC ID: %x > %x\n",
+       pr_warn("Boot CPU APIC ID not the first enumerated APIC ID: %x != %x\n",
                 topo_info.boot_cpu_apic_id, apic_id);
+
+       if (is_bsp && has_apic_base) {
+               topo_info.real_bsp_apic_id = topo_info.boot_cpu_apic_id;
+               goto fwbug;
+       }
+
         pr_warn("Crash kernel detected. Disabling real BSP to prevent machine INIT\n");
 
         topo_info.real_bsp_apic_id = apic_id;
         return true;
+
+fwbug:
+       pr_warn(FW_BUG "APIC enumeration order not specification compliant\n");
+       return false;
 }
 
 static unsigned int topo_unit_count(u32 lvlid, enum x86_topology_domains at_level,

https://www.tng.de

Executive board (Geschäftsführer):
Dr. Sven Willert (CEO/Vorsitz),
Gunnar Peter, Sven Schade,
Carsten Tolkmit, Bernd Sontheimer

Amtsgericht Kiel HRB 6002 KI
USt-ID: DE225201428
Die Information über die Verarbeitung Ihrer Daten
gemäß Artikel 12 DSGVO können Sie unter https://www.tng.de/datenschutz/ abrufen.
______________________________________________________________________

Attachment: smime.p7s
Description: S/MIME cryptographic signature