Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance

From: Jonathan Cameron
Date: Tue Apr 16 2024 - 13:35:22 EST


On Mon, 15 Apr 2024 19:41:43 +0200
"Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:

> On Mon, Apr 15, 2024 at 7:35 PM Jonathan Cameron
> <Jonathan.Cameron@xxxxxxxxxx> wrote:
> >
> > On Mon, 15 Apr 2024 17:50:57 +0100
> > Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:
> >
> > > On Mon, 15 Apr 2024 18:19:17 +0200
> > > "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:
> > >
> > > > On Mon, Apr 15, 2024 at 6:16 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, Apr 15, 2024 at 5:49 PM Jonathan Cameron
> > > > > <Jonathan.Cameron@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Fri, 12 Apr 2024 20:10:54 +0200
> > > > > > "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > > > > > > <Jonathan.Cameron@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > The arm64 specific arch_register_cpu() needs to access the _STA
> > > > > > > > method of the DSDT object so make it available by assigning the
> > > > > > > > appropriate handle to the struct cpu instance.
> > > > > > > >
> > > > > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > > > > > > > ---
> > > > > > > > drivers/acpi/acpi_processor.c | 3 +++
> > > > > > > > 1 file changed, 3 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > > > > > index 7a0dd35d62c9..93e029403d05 100644
> > > > > > > > --- a/drivers/acpi/acpi_processor.c
> > > > > > > > +++ b/drivers/acpi/acpi_processor.c
> > > > > > > > @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > > > > union acpi_object object = { 0 };
> > > > > > > > struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
> > > > > > > > struct acpi_processor *pr = acpi_driver_data(device);
> > > > > > > > + struct cpu *c;
> > > > > > > > int device_declaration = 0;
> > > > > > > > acpi_status status = AE_OK;
> > > > > > > > static int cpu0_initialized;
> > > > > > > > @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > > > > cpufreq_add_device("acpi-cpufreq");
> > > > > > > > }
> > > > > > > >
> > > > > > > > + c = &per_cpu(cpu_devices, pr->id);
> > > > > > > > + ACPI_COMPANION_SET(&c->dev, device);
> > > > > > >
> > > > > > > This is also set for per_cpu(cpu_sys_devices, pr->id) in
> > > > > > > acpi_processor_add(), via acpi_bind_one().
> > > > > >
> > > > > > Hi Rafael,
> > > > > >
> > > > > > cpu_sys_devices gets filled with a pointer to this same structure.
> > > > > > > The contents get set in register_cpu() so at this point
> > > > > > it doesn't point anywhere. As a side note register_cpu()
> > > > > > memsets to zero the value I set it to in the code above which isn't
> > > > > > great, particularly as I want to use this in post_eject for
> > > > > > arm64.
> > > > > >
> > > > > > We could make a copy of the handle and put it back after
> > > > > > the memset in register_cpu() but that is also ugly.
> > > > > > It's the best I've come up with to make sure this is still set
> > > > > > come remove time but is rather odd.
> > > > > > >
> > > > > > > Moreover, there is some pr->id validation in acpi_processor_add(), so
> > > > > > > it seems premature to use it here this way.
> > > > > > >
> > > > > > > I think that ACPI_COMPANION_SET() should be called from here on
> > > > > > > per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
> > > > > > > pr->id validation should all be done here) and then NULL can be passed
> > > > > > > as acpi_dev to acpi_bind_one() in acpi_processor_add(). Then, there
> > > > > > > will be one physical device corresponding to the processor ACPI device
> > > > > > > and no confusion.
> > > > > >
> > > > > > I'm fairly sure this is pointing to the same device but agreed this
> > > > > > is a tiny bit confusing. However we can't use cpu_sys_devices at this point
> > > > > > so I'm not immediately seeing a cleaner solution :(
> > > > >
> > > > > Well, OK.
> > > > >
> > > > > Please at least consider doing the pr->id validation checks before
> > > > > setting the ACPI companion for &per_cpu(cpu_devices, pr->id).
> > > > >
> > > > > Also, acpi_bind_one() needs to be called on the "physical" devices
> > > > > passed to ACPI_COMPANION_SET() (with NULL as the second argument) for
> > > > > the reference counting and physical device lookup to work.
> > > > >
> > > > > Please also note that acpi_primary_dev_companion() should return
> > > > > per_cpu(cpu_sys_devices, pr->id) for the processor ACPI device, which
> > > > > depends on the order of acpi_bind_one() calls involving the same ACPI
> > > > > device.
> > > >
> > > > Of course, if the value set by ACPI_COMPANION_SET() is cleared
> > > > subsequently, the above is not needed, but then using
> > > > ACPI_COMPANION_SET() is questionable overall.
> > >
> > > Agreed + smoothing over that by stashing and putting it back doesn't
> > > > work because there is an additional call to acpi_bind_one() in between
> > > here and the one you reference.
> > >
> > > The arch_register_cpu() calls end up calling register_cpu() /
> > > device_register() / acpi_device_notify() / acpi_bind_one()
> > >
> > > With current code that fails (silently)
>
> And that's why there is an explicit acpi_bind_one() invocation in
> acpi_processor_add().
>
> > > If I make sure the handle is set before register_cpu() then it
> > > succeeds, but we end up with duplicate sysfs files etc because we
> > > bind twice.
>
> Right, I should have recalled that earlier.
>
> > > I think the only way around this is larger reorganization of the
> > > CPU hotplug code to pull the arch_register_cpu() call to where
> > > the acpi_bind_one() call is. However that changes a lot more than I'd like
> > > (and I don't have it working yet).
>
> I see.
>
> > > Alternatively find somewhere else to stash the handle, or just add it as
> > > a parameter to arch_register_cpu(). Right now this feels the easier
> > > path to me. arch_register_cpu(int cpu, acpi_handle handle)
> > >
> > > Would that be a path you'd consider?
> >
> > Another option would be to do the per_cpu(processors, pr->id) = pr
> > a few lines earlier than currently and access that directly from the
> > arch_register_cpu() call. Similarly remove that reference a bit later and
> > use it in arch_unregister_cpu().
> >
> > This seems like the simplest solution, but I may be missing something.
>
> This should work AFAICS, but I'd move the entire piece of code between
> BUG_ON() and setting per_cpu(processors, pr->id) inclusive:

Hi Rafael,

Unfortunately this is more complex on x86 than I realized :(

On x86 the initial pr->id is invalid, which is one of the conditions
that leads to acpi_processor_hotadd_init() being called.
It only becomes valid after acpi_map_cpu() in acpi_processor_hotadd_init().

So the best I can immediately come up with is to factor out these checks and the
setting of the per_cpu structures, then do both either in acpi_processor_hotadd_init()
or in an else branch for the non-hotplug / normal registration path (where pr->id is
already valid).
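
As a rough illustration of that factoring, here is a userspace mock, not kernel
code. Everything named mock_*, plus NR_CPU_IDS and INVALID_CPU_ID, is made up for
this sketch; plain arrays stand in for the per_cpu() variables, and the hotplug
condition is reduced to the invalid-id case only:

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPU_IDS	4	/* hypothetical stand-in for nr_cpu_ids */
#define INVALID_CPU_ID	(-1)	/* hypothetical marker for an invalid pr->id */

struct mock_device { int acpi_id; };
struct mock_processor { int id; };

/* Plain arrays stand in for the per_cpu() variables. */
static struct mock_device *processor_device_array[NR_CPU_IDS];
static struct mock_processor *processors[NR_CPU_IDS];

/* Hypothetical helper: the checks and per-CPU assignments, factored out. */
static int mock_set_per_cpu(struct mock_processor *pr,
			    struct mock_device *device)
{
	/* Stands in for BUG_ON(pr->id >= nr_cpu_ids). */
	assert(pr->id >= 0 && pr->id < NR_CPU_IDS);

	/* Buggy BIOS check: the slot is already claimed by another device. */
	if (processor_device_array[pr->id] != NULL &&
	    processor_device_array[pr->id] != device)
		return -1;

	processor_device_array[pr->id] = device;
	processors[pr->id] = pr;
	return 0;
}

/* Mock hotplug path: the id only becomes valid after the mapping step. */
static int mock_hotadd_init(struct mock_processor *pr,
			    struct mock_device *device, int mapped_id)
{
	pr->id = mapped_id;	/* stands in for acpi_map_cpu() */
	return mock_set_per_cpu(pr, device);
}

/* Mock of the branch at the end of acpi_processor_get_info(). */
static int mock_get_info(struct mock_processor *pr,
			 struct mock_device *device, int mapped_id)
{
	if (pr->id == INVALID_CPU_ID)
		return mock_hotadd_init(pr, device, mapped_id);
	else
		return mock_set_per_cpu(pr, device);
}
```

The point of the sketch is only that both paths end up in the same helper, and
that the helper never runs before the id is valid.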

Naturally found this on my final set of tests...

A little ugly but not 'too bad'.

Jonathan
p.s. No one minds if I break x86, right?

>
>     BUG_ON(pr->id >= nr_cpu_ids);
>
>     /*
>      * Buggy BIOS check.
>      * ACPI id of processors can be reported wrongly by the BIOS.
>      * Don't trust it blindly
>      */
>     if (per_cpu(processor_device_array, pr->id) != NULL &&
>         per_cpu(processor_device_array, pr->id) != device) {
>         dev_warn(&device->dev,
>             "BIOS reported wrong ACPI id %d for the processor\n",
>             pr->id);
>         /* Give up, but do not abort the namespace scan. */
>         goto err;
>     }
>     /*
>      * processor_device_array is not cleared on errors to allow buggy BIOS
>      * checks.
>      */
>     per_cpu(processor_device_array, pr->id) = device;
>     per_cpu(processors, pr->id) = pr;
>
> into acpi_processor_get_info(), right after the point where pr->id is set.