Re: [PATCH RFC v3 03/21] ACPI: processor: Register CPUs that are online, but not described in the DSDT

From: Russell King (Oracle)
Date: Mon Jan 15 2024 - 06:06:58 EST


On Mon, Dec 18, 2023 at 09:22:03PM +0100, Rafael J. Wysocki wrote:
> On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@xxxxxxxxxxxxxxx> wrote:
> >
> > From: James Morse <james.morse@xxxxxxx>
> >
> > ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> > in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> > says "Each processor in the system must be declared in the ACPI
> > namespace"). Having two descriptions allows firmware authors to get
> > this wrong.
> >
> > If CPUs are described in the MADT/APIC, they will be brought online
> > early during boot. Once the register_cpu() calls are moved to ACPI,
> > they will be based on the DSDT description of the CPUs. When CPUs are
> > missing from the DSDT description, they will end up online, but not
> > registered.
> >
> > Add a helper that runs after acpi_init() has completed to register
> > CPUs that are online, but weren't found in the DSDT. Any CPU that
> > is registered by this code triggers a firmware-bug warning and kernel
> > taint.
> >
> > Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> > is configured.
>
> So why is this a kernel problem?

So what are you proposing should be the behaviour here? What this
statement seems to be saying is that QEMU as it exists today only
describes the first CPU in DSDT.

As this patch series changes when arch_register_cpu() gets called (as
described in the paragraph above) we obviously need to preserve the
_existing_ behaviour to avoid causing regressions. So, if changing the
kernel causes user visible regressions (e.g. sysfs entries to
disappear) then it obviously _is_ a kernel problem that needs to be
solved.

We can't say "well fix QEMU then" without invoking the wrath of Linus.

> > Signed-off-by: James Morse <james.morse@xxxxxxx>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx>
> > Tested-by: Miguel Luis <miguel.luis@xxxxxxxxxx>
> > Tested-by: Vishnu Pajjuri <vishnu@xxxxxxxxxxxxxxxxxxxxxx>
> > Tested-by: Jianyong Wu <jianyong.wu@xxxxxxx>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx>
> > ---
> > drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
> > 1 file changed, 19 insertions(+)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 6a542e0ce396..0511f2bc10bc 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -791,6 +791,25 @@ void __init acpi_processor_init(void)
> > acpi_pcc_cpufreq_init();
> > }
> >
> > +static int __init acpi_processor_register_missing_cpus(void)
> > +{
> > + int cpu;
> > +
> > + if (acpi_disabled)
> > + return 0;
> > +
> > + for_each_online_cpu(cpu) {
> > + if (!get_cpu_device(cpu)) {
> > + pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
> > + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> > + arch_register_cpu(cpu);
>
> Which part of this code is related to ACPI?

That's a good question, and I suspect it would be more suited to being
placed in drivers/base/cpu.c except for the problem that the error
message refers to ACPI.

As long as we keep the acpi_disabled test, I guess that's fine.
cpu_dev_register_generic() there already tests acpi_disabled.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!