Re: [PATCH v6 06/16] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()

From: Jonathan Cameron
Date: Wed Apr 17 2024 - 14:57:49 EST


On Wed, 17 Apr 2024 19:59:50 +0200
"Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:

> On Wed, Apr 17, 2024 at 7:09 PM Jonathan Cameron
> <Jonathan.Cameron@xxxxxxxxxx> wrote:
> >
> > On Wed, 17 Apr 2024 17:59:36 +0200
> > "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:
> >
> > > On Wed, Apr 17, 2024 at 5:38 PM Jonathan Cameron
> > > <Jonathan.Cameron@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, 17 Apr 2024 16:03:51 +0100
> > > > Salil Mehta <salil.mehta@xxxxxxxxxx> wrote:
> > > >
> > > > > > From: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
> > > > > > Sent: Wednesday, April 17, 2024 2:19 PM
> > > > > >
> > > > > > From: James Morse <james.morse@xxxxxxx>
> > > > > >
> > > > > > The arm64 specific arch_register_cpu() call may defer CPU registration until
> > > > > > the ACPI interpreter is available and the _STA method can be evaluated.
> > > > > >
> > > > > > If this occurs, then a second attempt is made in acpi_processor_get_info().
> > > > > > Note that the arm64 specific call has not yet been added so for now this will
> > > > > > be called for the original hotplug case.
> > > > > >
> > > > > > For architectures that do not defer until the ACPI Processor driver loads
> > > > > > (e.g. x86), for initially present CPUs there will already be a CPU device. If
> > > > > > present do not try to register again.
> > > > > >
> > > > > > Systems can still be booted with 'acpi=off', or not include an ACPI
> > > > > > description at all as in these cases arch_register_cpu() will not have
> > > > > > deferred registration when first called.
> > > > > >
> > > > > > This moves the CPU register logic back to a subsys_initcall(), while the
> > > > > > memory nodes will have been registered earlier.
> > > > > > Note this is where the call was prior to the cleanup series so there should be
> > > > > > no side effects of moving it back again for this specific case.
> > > > > >
> > > > > > [PATCH 00/21] Initial cleanups for vCPU HP.
> > > > > > https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@xxxxxxxxxxxxxxxxxxxxx/
> > > > > >
> > > > > > e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> > > > > >
> > > > > > Signed-off-by: James Morse <james.morse@xxxxxxx>
> > > > > > Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx>
> > > > > > Tested-by: Miguel Luis <miguel.luis@xxxxxxxxxx>
> > > > > > Tested-by: Vishnu Pajjuri <vishnu@xxxxxxxxxxxxxxxxxxxxxx>
> > > > > > Tested-by: Jianyong Wu <jianyong.wu@xxxxxxx>
> > > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx>
> > > > > > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > > > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > > > > > ---
> > > > > > v6: Squash the two paths for conventional CPU Hotplug and arm64
> > > > > > vCPU HP.
> > > > > > v5: Update commit message to make it clear this is moving the
> > > > > > init back to where it was until very recently.
> > > > > >
> > > > > > No longer change the condition in the earlier registration point
> > > > > > as that will be handled by the arm64 registration routine
> > > > > > deferring until called again here.
> > > > > > ---
> > > > > > drivers/acpi/acpi_processor.c | 12 +++++++++++-
> > > > > > 1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > > > index 7ecb13775d7f..0cac77961020 100644
> > > > > > --- a/drivers/acpi/acpi_processor.c
> > > > > > +++ b/drivers/acpi/acpi_processor.c
> > > > > > @@ -356,8 +356,18 @@ static int acpi_processor_get_info(struct
> > > > > > acpi_device *device)
> > > > > > *
> > > > > > * NOTE: Even if the processor has a cpuid, it may not be present
> > > > > > * because cpuid <-> apicid mapping is persistent now.
> > > > > > + *
> > > > > > + * Note this allows 3 flows, it is up to the arch_register_cpu()
> > > > > > + * call to reject any that are not supported on a given architecture.
> > > > > > + * A) CPU becomes present.
> > > > > > + * B) Previously invalid logical CPU ID (Same as becoming present)
> > > > > > > + * C) CPU already present and now being enabled (and wasn't registered
> > > > > > > + * early on an arch that doesn't defer to here)
> > > > > > */
> > > > > > - if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> > > > > > + if ((!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > > > > > + !get_cpu_device(pr->id)) ||
> > > > > > + invalid_logical_cpuid(pr->id) ||
> > > > > > + !cpu_present(pr->id)) {
> > > > >
> > > > >
> > > > Hi Salil,
> > > >
> > > > Thanks for quick review!
> > > >
> > > > > Logic is clear but it is ugly. We should turn them into a macro or an inline function.
> > > >
> > > > You've found the 'ugly' in this approach vs keeping them separate.
> > > >
> > > > For this version I wanted to keep it clear that indeed this condition
> > > > is a complex mess of different things (and to let people compare
> > > > it easily with the two paths in v5 to convince themselves this
> > > > is the same).
> > > >
> > > > It's also a little tricky to do, so will need some thought.
> > > >
> > > > I don't think a simple acpi_cpu_is_hotplug() condition is useful
> > > > as it just moves the complexity away from where a reader is looking
> > > > and it would only be used in this one case.
> > > >
> > > > It doesn't separate well into finer grained subconditions because
> > > > (C) is a messy mix of the vCPU HP case and the "wasn't registered
> > > > earlier" case. That is the disadvantage of only deferring for
> > > > arm64 and not other architectures.
> > > >
> > > > The best I can quickly come up with is something like this:
> > > > #define acpi_cpu_not_present(cpu) \
> > > > (invalid_logical_cpuid(cpu) || !cpu_present(cpu))
> > > > #define acpi_cpu_not_enabled(cpu) \
> > > > (!invalid_logical_cpuid(cpu) && cpu_present(cpu))
> > > >
> > > > if ((acpi_cpu_not_enabled(pr->id) && !get_cpu_device(pr->id)) ||
> > > > acpi_cpu_not_present(pr->id))
> > > >
> > > > Which would still need the same amount of documentation. The
> > > > code still isn't enough for me to immediately be able to see
> > > > what is going on.
> > > >
> > > > So maybe worth it... I'm not sure. Rafael, you get to keep this
> > > > fun, what would you prefer?
> > >
> > > I would use a static inline function returning bool to carry out these
> > > checks with comments explaining the different cases in which 'true'
> > > needs to be returned.
> >
> > The following makes a subtle logic change (I'll retest tomorrow) but
> > I think that get_cpu_device(cpu) can never succeed in a path where
> > hotadd makes sense.
> >
> > +/*
> > + * Identify if the state transition indicates that hotadd_init
> > + * should be called.
> > + *
> > + * For acpi_processor_add() to be called, the reported state must
> > + * now be enabled and present. Conditions reflect prior state.
> > + */
> > +static inline bool acpi_processor_should_hotadd_init(int cpu)
> > +{
> > + /* Already register, initial registration was not deferred */
>
> "Already registered." I think.
>
> > + if (get_cpu_device(cpu))
> > + return false;
> > +
> > + /* Processor has become present */
> > + if (!cpu_present(cpu))
> > + return true;
> > +
> > + /* Logical cpuid currently invalid indicates hotadd */
> > + if (invalid_logical_cpuid(cpu))
> > + return true;
> > +
> > + /*
> > + * Previously present and the logical cpu id is valid.
> > > + * Deferred registration now _STA can be queried, or
> > + * Hotadd due to enabled becoming true on an online capable
> > + * CPU.
> > + */
> > + if (cpu_present(cpu))
> > + return true;
>
> It returns true for both the cpu_present(cpu) and !cpu_present(cpu)
> cases, so will it ever return false except for when
> get_cpu_device(cpu) returns true?

It indeed looks suspicious. My logic is probably wrong. Will check
- I guess maybe pulling out the get_cpu_device(cpu) indeed flattens
this as you point out. Kind of makes sense if true.

Jonathan

>
> > +
> > + return false;
> > +}
> > +
> > static int acpi_processor_get_info(struct acpi_device *device)
> > {
> > union acpi_object object = { 0 };
> > @@ -356,18 +388,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > *
> > * NOTE: Even if the processor has a cpuid, it may not be present
> > * because cpuid <-> apicid mapping is persistent now.
> > - *
> > - * Note this allows 3 flows, it is up to the arch_register_cpu()
> > - * call to reject any that are not supported on a given architecture.
> > - * A) CPU becomes present.
> > - * B) Previously invalid logical CPU ID (Same as becoming present)
> > - * C) CPU already present and now being enabled (and wasn't registered
> > - * early on an arch that doesn't defer to here)
> > */
> > - if ((!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > - !get_cpu_device(pr->id)) ||
> > - invalid_logical_cpuid(pr->id) ||
> > - !cpu_present(pr->id)) {
> > + if (acpi_processor_should_hotadd_init(pr->id)) {
> > ret = acpi_processor_hotadd_init(pr, device);
> >