Re: [tip: x86/urgent] x86/acpi: Ignore invalid x2APIC entries

From: Zhang, Rui
Date: Fri Dec 01 2023 - 22:01:22 EST


On Fri, 2023-12-01 at 12:23 -0800, Ashok Raj wrote:
> On Fri, Dec 01, 2023 at 10:08:55AM -0800, Zhang, Rui wrote:
> > On Thu, 2023-11-30 at 19:25 -0800, Ashok Raj wrote:
> > > On Thu, Nov 23, 2023 at 12:50:47PM +0000, Zhang Rui wrote:
> > > > Hi, John,
> > > >
> > > > Thanks for catching this issue.
> > > >
> > > > On Wed, 2023-11-22 at 22:19 +0000, John Sperbeck wrote:
> > > > > I have a platform with both LOCAL_APIC and LOCAL_X2APIC entries
> > > > > for each CPU.  However, the ids for the LOCAL_APIC entries are
> > > > > all invalid ids of 255, so they have always been skipped in
> > > > > acpi_parse_lapic() by this code from f3bf1dbe64b6 ("x86/acpi:
> > > > > Prevent LAPIC id 0xff from being accounted"):
> > > > >
> > > > >     /* Ignore invalid ID */
> > > > >     if (processor->id == 0xff)
> > > > >             return 0;
> > > > >
> > > > > With the change in this thread, the return value of 0 means that
> > > > > the 'count' variable in acpi_parse_entries_array() is
> > > > > incremented.  The positive return value means that
> > > > > 'has_lapic_cpus' is set, even though no entries were actually
> > > > > matched.
> > > >
> > > > So in acpi_parse_madt_lapic_entries, without this patch,
> > > > madt_proc[0].count is a positive value on this platform, right?
> > > >
> > > > This sounds like a potential issue because the following checks to
> > > > fall back to MPS mode can also break. (If all LOCAL_APIC entries
> > > > have apic_id 0xff and all LOCAL_X2APIC entries have apic_id
> > > > 0xffffffff)
> > > >
> > > > >   Then, when the MADT is iterated
> > > > > with acpi_parse_x2apic(), the x2apic entries with ids less than
> > > > > 255 are skipped and most of my CPUs aren't recognized.
> > >
> > > This smells wrong. If a BIOS is placing some in lapic and some in
> > > x2apic table, it's really messed up.
> > >
> > > Shouldn't the kernel scan them in some priority and only consider
> > > one set of tables?
> > >
> > > Shouldn't the code stop looking once a type is found?
> > >
> >
> > I also want to get this clarified, but there is no spec saying this.
> > Instead, as mentioned in the comment, we do have something in
> > https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#processor-local-x2apic-structure
> >
> > "[Compatibility note] On some legacy OSes, Logical processors with
> > APIC ID values less than 255 (whether in XAPIC or X2APIC mode) must
> > use the Processor Local APIC structure to convey their APIC
> > information to OSPM, and those processors must be declared in the
> > DSDT using the Processor() keyword. Logical processors with APIC ID
> > values 255 and greater must use the Processor Local x2APIC structure
> > and be declared using the Device() keyword."
> >
> > so it is possible to enumerate CPUs from both LAPIC and X2APIC.
> >
>
> Ah, so this looks like the legacy case: an old OS can at least boot the
> APIC entries and not process the x2apic ones.
>
> So you can potentially have duplicates
>
> APIC   = has all APIC IDs < 255
> X2APIC = has all entries >= 255, OR
>          it can contain everything, so you might need to weed out
>          duplicates?
>
That is what this patch tries to do: if we have valid CPUs in LAPIC,
probe only the X2APIC CPUs with ID >= 255.
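
To make the rule concrete, here is a minimal standalone sketch of the
intended filtering. It is not the code in arch/x86/kernel/acpi/boot.c:
struct madt_entry and count_usable_cpus() are made-up names for
illustration, and only the invalid-ID values (0xff, 0xffffffff) and the
has_lapic_cpus idea come from this thread. Note that has_lapic_cpus is
set only when a LAPIC entry with a valid ID is seen, which is the
behavior John's report asks for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified MADT entry; not the kernel's table layout. */
enum entry_type { ENTRY_LAPIC, ENTRY_X2APIC };

struct madt_entry {
	enum entry_type type;
	uint32_t apic_id;
};

/* Return how many entries would be accepted as CPUs under the rule above. */
size_t count_usable_cpus(const struct madt_entry *entries, size_t n)
{
	bool has_lapic_cpus = false;
	size_t cpus = 0;
	size_t i;

	assert(entries != NULL || n == 0);

	/* Pass 1: LAPIC entries. ID 0xff is invalid and must not count. */
	for (i = 0; i < n; i++) {
		if (entries[i].type != ENTRY_LAPIC)
			continue;
		if (entries[i].apic_id == 0xff)
			continue;
		has_lapic_cpus = true;	/* set only for a valid entry */
		cpus++;
	}

	/* Pass 2: x2APIC entries. */
	for (i = 0; i < n; i++) {
		if (entries[i].type != ENTRY_X2APIC)
			continue;
		/* Ignore invalid ID */
		if (entries[i].apic_id == 0xffffffff)
			continue;
		/* With valid LAPIC CPUs present, accept only IDs >= 255 */
		if (has_lapic_cpus && entries[i].apic_id < 0xff)
			continue;
		cpus++;
	}

	return cpus;
}
```

On a table like John's (all LAPIC IDs 0xff, real IDs only in X2APIC),
has_lapic_cpus stays false in this sketch, so the x2APIC entries with
IDs below 255 are still enumerated.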

thanks,
rui