Re: [tip: x86/urgent] x86/acpi: Ignore invalid x2APIC entries

From: Zhang, Rui
Date: Fri Dec 01 2023 - 03:32:14 EST


On Thu, 2023-11-23 at 20:50 +0800, Exchange Online wrote:
> Hi, John,
>
> Thanks for catching this issue.
>
> On Wed, 2023-11-22 at 22:19 +0000, John Sperbeck wrote:
> > I have a platform with both LOCAL_APIC and LOCAL_X2APIC entries for
> > each CPU.  However, the ids for the LOCAL_APIC entries are all
> > invalid ids of 255, so they have always been skipped in
> > acpi_parse_lapic()
> > by this code from f3bf1dbe64b6 ("x86/acpi: Prevent LAPIC id 0xff
> > from
> > being
> > accounted"):
> >
> >     /* Ignore invalid ID */
> >     if (processor->id == 0xff)
> >             return 0;
> >
> > With the change in this thread, the return value of 0 means that
> > the
> > 'count' variable in acpi_parse_entries_array() is incremented.  The
> > positive return value means that 'has_lapic_cpus' is set, even
> > though
> > no entries were actually matched.
>
> So in acpi_parse_madt_lapic_entries, without this patch,
> madt_proc[0].count is a positive value on this platform, right?
>
> This sounds like a potential issue because the following checks to
> fall
> back to MPS mode can also break. (If all LOCAL_APIC entries have
> apic_id 0xff and all LOCAL_X2APIC entries have apic_id 0xffffffff)
>
> >   Then, when the MADT is iterated
> > with acpi_parse_x2apic(), the x2apic entries with ids less than 255
> > are skipped and most of my CPUs aren't recognized.
> >
> > I think the original version of this change was okay for this case
> > in
> > https://lore.kernel.org/lkml/87pm4bp54z.ffs@tglx/T/
>
> Yeah.
>
> But if we want to fix the potential issue above, we need to do
> something more.
>
> Say we can still use acpi_table_parse_entries_array() and convert
> acpi_parse_lapic()/acpi_parse_x2apic() to
> acpi_subtable_proc.handler_arg and save the real valid entries via
> the
> parameter.
>
> or can we just use num_processors & disabled_cpus to check if there
> is
> any CPU probed when parsing LOCAL_APIC/LOCAL_X2APIC entires?
>

Hi, John,

As a quick fix, I'm not going to fix the "potential issue" describes
above because we have not seen a real problem caused by this yet.

Can you please try the below patch to confirm if the problem is gone on
your system?
This patch falls back to the previous way as sent at
https://lore.kernel.org/lkml/87pm4bp54z.ffs@tglx/T/

thanks,
rui

From bdb45e241b4fea8a12b958e490979e96b064e43d Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang@xxxxxxxxx>
Date: Fri, 1 Dec 2023 15:06:34 +0800
Subject: [PATCH] x86/acpi: Do strict X2APIC ID check only when an enabled CPU
is enumerated via LAPIC

Commit 8e9c42d776d6 ("x86/acpi: Ignore invalid x2APIC entries") does
strict X2APIC ID check if LAPIC contains valid CPUs by checking the
acpi_table_parse_madt() return value.

This is wrong because acpi_table_parse_madt() return value only
represents the number of legal entries parsed. For example, LAPIC entry
with LAPIC ID 0xff is counted as a legal entry, but it doesn't describe
a valid CPU.

This causes issues on a system which has 0xff LAPIC ID in all LAPIC
entries. Because the code does strict X2APIC IDs check and ignores most
of the CPUs in X2APIC entries.

Fix the problem by doing strict X2APIC ID check less aggressively, say
only when an enabled CPU is enumerated via LAPIC.

Fixes: 8e9c42d776d6 ("x86/acpi: Ignore invalid x2APIC entries")
Link: https://lore.kernel.org/all/20231122221947.781812-1-jsperbeck@xxxxxxxxxx/
Reported-by: John Sperbeck <jsperbeck@xxxxxxxxxx>
Signed-off-by: Zhang Rui <rui.zhang@xxxxxxxxx>
---
arch/x86/kernel/acpi/boot.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 1a0dd80d81ac..8cc566ce486a 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -266,6 +266,7 @@ static int __init
acpi_parse_lapic(union acpi_subtable_headers * header, const unsigned long end)
{
struct acpi_madt_local_apic *processor = NULL;
+ int cpu;

processor = (struct acpi_madt_local_apic *)header;

@@ -289,9 +290,13 @@ acpi_parse_lapic(union acpi_subtable_headers * header, const unsigned long end)
* to not preallocating memory for all NR_CPUS
* when we use CPU hotplug.
*/
- acpi_register_lapic(processor->id, /* APIC ID */
- processor->processor_id, /* ACPI ID */
- processor->lapic_flags & ACPI_MADT_ENABLED);
+ cpu = acpi_register_lapic(processor->id, /* APIC ID */
+ processor->processor_id, /* ACPI ID */
+ processor->lapic_flags & ACPI_MADT_ENABLED);
+
+ /* Do strict X2APIC ID check only when an enabled CPU is enumerated via LAPIC */
+ if (cpu >= 0 )
+ has_lapic_cpus = true;

return 0;
}
@@ -1134,7 +1139,6 @@ static int __init acpi_parse_madt_lapic_entries(void)
if (!count) {
count = acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_APIC,
acpi_parse_lapic, MAX_LOCAL_APIC);
- has_lapic_cpus = count > 0;
x2count = acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_X2APIC,
acpi_parse_x2apic, MAX_LOCAL_APIC);
}
--
2.34.1