Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()

From: Russell King (Oracle)
Date: Fri Apr 12 2024 - 16:16:42 EST


On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
> On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> <Jonathan.Cameron@xxxxxxxxxx> wrote:
> >
> > From: James Morse <james.morse@xxxxxxx>
> >
> > The arm64 specific arch_register_cpu() call may defer CPU registration
> > until the ACPI interpreter is available and the _STA method can
> > be evaluated.
> >
> > If this occurs, then a second attempt is made in
> > acpi_processor_get_info(). Note that the arm64 specific call has
> > not yet been added so for now this will never be successfully
> > called.
> >
> > Systems can still be booted with 'acpi=off', or not include an
> > ACPI description at all as in these cases arch_register_cpu()
> > will not have deferred registration when first called.
> >
> > This moves the CPU register logic back to a subsys_initcall(),
> > while the memory nodes will have been registered earlier.
> > Note this is where the call was prior to the cleanup series so
> > there should be no side effects of moving it back again for this
> > specific case.
> >
> > [PATCH 00/21] Initial cleanups for vCPU HP.
> > https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@xxxxxxxxxxxxxxxxxxxxx/
> >
> > e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> >
> > Signed-off-by: James Morse <james.morse@xxxxxxx>
> > Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx>
> > Tested-by: Miguel Luis <miguel.luis@xxxxxxxxxx>
> > Tested-by: Vishnu Pajjuri <vishnu@xxxxxxxxxxxxxxxxxxxxxx>
> > Tested-by: Jianyong Wu <jianyong.wu@xxxxxxx>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx>
> > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > ---
> > v5: Update commit message to make it clear this is moving the
> > init back to where it was until very recently.
> >
> > No longer change the condition in the earlier registration point
> > as that will be handled by the arm64 registration routine
> > deferring until called again here.
> > ---
> > drivers/acpi/acpi_processor.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 93e029403d05..c78398cdd060 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >
> > c = &per_cpu(cpu_devices, pr->id);
> > ACPI_COMPANION_SET(&c->dev, device);
> > + /*
> > + * Register CPUs that are present. get_cpu_device() is used to skip
> > + * duplicate CPU descriptions from firmware.
> > + */
> > + if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > + !get_cpu_device(pr->id)) {
> > + int ret = arch_register_cpu(pr->id);
> > +
> > + if (ret)
> > + return ret;
> > + }
> > +
> > /*
> > * Extra Processor objects may be enumerated on MP systems with
> > * less than the max # of CPUs. They should be ignored _iff
> > --
>
> I am still unsure why there need to be two paths calling
> arch_register_cpu() in acpi_processor_get_info().
>
> Just below the comment partially pulled into the patch context above,
> there is this code:
>
> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> int ret = acpi_processor_hotadd_init(pr);
>
> if (ret)
> return ret;
> }
>
> For the sake of the argument, fold acpi_processor_hotadd_init() into
> it and drop the redundant _STA check from it:
>
> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> if (invalid_phys_cpuid(pr->phys_id))
> return -ENODEV;
>
> cpu_maps_update_begin();
> cpus_write_lock();
>
> ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> if (ret) {
> cpus_write_unlock();
> cpu_maps_update_done();
> return ret;
> }
> ret = arch_register_cpu(pr->id);
> if (ret) {
> acpi_unmap_cpu(pr->id);
>
> cpus_write_unlock();
> cpu_maps_update_done();
> return ret;
> }
> pr_info("CPU%d has been hot-added\n", pr->id);
> pr->flags.need_hotplug_init = 1;
>
> cpus_write_unlock();
> cpu_maps_update_done();
> }
>
> so I'm not sure why this cannot be combined with the new code.
>
> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
> What's the difference then? The locking, which should be fine if I'm
> not mistaken and need_hotplug_init that needs to be set if this code
> runs after the processor driver has loaded AFAICS.

It is over this that I walked away from progressing this code, because
I don't think it's quite as simple as you make it out to be.

Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch implemented
functions, so Arm64 can easily provide stubs for these that do nothing.
That never caused me any concern.
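
As a rough illustration of what such do-nothing stubs might look like, here
is a userspace sketch. It is not the actual Arm64 code. The typedefs are
stand-ins for the kernel types, and the choice to leave *pcpu untouched
assumes the logical CPU id was already assigned earlier in boot.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for kernel types so the sketch builds in userspace (assumptions). */
typedef void *acpi_handle;
typedef uint64_t phys_cpuid_t;
typedef uint32_t u32;

/*
 * Stub: nothing to map. Assumes pr->id was already valid before the call,
 * so *pcpu is deliberately left unchanged.
 */
int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
		 int *pcpu)
{
	(void)handle;
	(void)physid;
	(void)acpi_id;
	(void)pcpu;
	return 0;
}

/* Stub: nothing to unmap, the CPU description never goes away. */
int acpi_unmap_cpu(int cpu)
{
	(void)cpu;
	return 0;
}
```

With stubs like these, the acpi_map_cpu() call in the folded code quoted
above would always succeed and would leave pr->id as it found it.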

What does cause me great concern though are the finer details. For
example, above you seem to drop the evaluation of _STA for the
"make_present" case - I've no idea whether that is something that
should be deleted or not (if it is something that can be deleted,
then why not delete it now?)

As for the cpu locking, I couldn't find anything in arch_register_cpu()
that depends on the cpu_maps_update stuff or needs cpus_write_lock to
be taken - so I've no idea why the "make_present" case takes these
locks.

Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
obvious that this is required - remember that with Arm64's "enabled"
toggling, the "processor" is a slice of the system and doesn't
actually go away - it's just "not enabled" for use.

Again, as "processors" in Arm64 are slices of the system, they have
to be fully described in ACPI before the OS boots, and they will be
marked as being "present", which means they will be enumerated, and
the driver will be probed. Any processor that is not to be used will
not have its enabled bit set. It is my understanding that every
processor will result in the ACPI processor driver being bound to it
whether it's enabled or not.

The difference between real hotplug and Arm64 hotplug is that real
hotplug makes stuff not-present (and thus unenumerable). Arm64 hotplug
makes stuff not-enabled which is still enumerable.

... or at least that is my understanding which may not be entirely
correct (which is why I stepped down because I feel totally out of
my depth with ACPI stuff.)
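
The present/enabled distinction described above can be sketched in terms of
the _STA result bits. This is only an illustration under my reading of the
discussion: the bit values follow the ACPI specification (bit 0 = present,
bit 1 = enabled), while the helper names sta_enumerable() and sta_usable()
are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* _STA result bits, per the ACPI specification. */
#define STA_PRESENT	0x01u
#define STA_ENABLED	0x02u

/* Hypothetical helper: a device is enumerable as long as it is present. */
static bool sta_enumerable(uint32_t sta)
{
	return (sta & STA_PRESENT) != 0;
}

/* Hypothetical helper: a CPU is usable only if present *and* enabled. */
static bool sta_usable(uint32_t sta)
{
	uint32_t both = STA_PRESENT | STA_ENABLED;

	return (sta & both) == both;
}
```

Under this model, "real" hot-unplug takes a CPU to _STA == 0 (neither
enumerable nor usable), whereas the Arm64 case only clears the enabled bit,
so the CPU stays enumerable but is not usable.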

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!