Re: [PATCH RFC v3 15/21] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs
From: Jonathan Cameron
Date: Fri Dec 15 2023 - 11:38:59 EST
On Wed, 13 Dec 2023 12:50:28 +0000
Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx> wrote:
> From: James Morse <james.morse@xxxxxxx>
>
> To support virtual CPU hotplug, ACPI has added an 'online capable' bit
> to the MADT GICC entries. This indicates a disabled CPU entry may not
> be possible to online via PSCI until firmware has set enabled bit in
> _STA.
>
> What about the redistributor in the GICC entry? ACPI doesn't want to say.
> Assume the worst: When a redistributor is described in the GICC entry,
> but the entry is marked as disabled at boot, assume the redistributor
> is inaccessible.
>
> The GICv3 driver doesn't support late online of redistributors, so this
> means the corresponding CPU can't be brought online either. Clear the
> possible and present bits.
>
> Systems that want CPU hotplug in a VM can ensure their redistributors
> are always-on, and describe them that way with a GICR entry in the MADT.
>
> When mapping redistributors found via GICC entries, handle the case
> where the arch code believes the CPU is present and possible, but it
> does not have an accessible redistributor. Print a warning and clear
> the present and possible bits.
>
> Signed-off-by: James Morse <james.morse@xxxxxxx>
> Tested-by: Miguel Luis <miguel.luis@xxxxxxxxxx>
> Tested-by: Vishnu Pajjuri <vishnu@xxxxxxxxxxxxxxxxxxxxxx>
> Tested-by: Jianyong Wu <jianyong.wu@xxxxxxx>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@xxxxxxxxxxxxxxx>
Seems reasonable, but this contains the blob that makes the change I called
out in the previous patch relevant. With a forwards reference in that patch.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> ----
> Disabled but online-capable CPUs cause this message to be printed
> if their redistributors are described via GICC:
> | GICv3: CPU 3's redistributor is inaccessible: this CPU can't be brought online
>
> If ACPI's _STA tries to make the cpu present later, this message is printed:
> | Changing CPU present bit is not supported
>
> Changes since RFC v2:
> * use gicc->flags & (ACPI_MADT_ENABLED | ACPI_MADT_GICC_CPU_CAPABLE)
> ---
> drivers/irqchip/irq-gic-v3.c | 14 ++++++++++++++
> include/linux/acpi.h | 2 +-
> 2 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index ebecd4546830..6d0f98d3540e 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2370,11 +2370,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> (struct acpi_madt_generic_interrupt *)header;
> u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
> u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> + int cpu = get_cpu_for_acpi_id(gicc->uid);
> void __iomem *redist_base;
>
> if (!acpi_gicc_is_usable(gicc))
> return 0;
>
> + /*
> + * Capable but disabled CPUs can be brought online later. What about
> + * the redistributor? ACPI doesn't want to say!
> + * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> + * Otherwise, prevent such CPUs from being brought online.
> + */
> + if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> + pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> + set_cpu_present(cpu, false);
> + set_cpu_possible(cpu, false);
> + return 0;
> + }
> +
> redist_base = ioremap(gicc->gicr_base_address, size);
> if (!redist_base)
> return -ENOMEM;
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 19d009ca9e7a..00be66683505 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -238,7 +238,7 @@ void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
>
> static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
> {
> - return gicc->flags & ACPI_MADT_ENABLED;
> + return gicc->flags & (ACPI_MADT_ENABLED | ACPI_MADT_GICC_CPU_CAPABLE);
This is where the change is made that broke the code path in the previous
patch. No problem with splitting that across patches, but maybe call out
why in the patch intro for the previous patch.
> }
>
> /* the following numa functions are architecture-dependent */
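
For anyone following along, here is how I read the combined effect of the two
hunks, as a minimal standalone sketch rather than kernel code. The flag values,
the enum and the function names (MADT_ENABLED, MADT_ONLINE_CAPABLE,
classify_gicc() and so on) are made up for illustration and are not taken from
the kernel headers; only the order of the checks follows the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; assumed for this sketch, not copied from actbl2.h. */
#define MADT_ENABLED        (1u << 0)
#define MADT_ONLINE_CAPABLE (1u << 3)

/* Hypothetical outcomes for one GICC entry when mapping redistributors. */
enum gicc_action {
	GICC_SKIP,        /* neither enabled nor online capable: ignore the entry */
	GICC_NO_ONLINE,   /* online capable but disabled: clear present/possible */
	GICC_MAP_REDIST,  /* enabled at boot: ioremap the redistributor */
};

/* Mirrors the widened usability check: either flag makes the entry usable. */
static bool gicc_is_usable(uint32_t flags)
{
	return (flags & (MADT_ENABLED | MADT_ONLINE_CAPABLE)) != 0;
}

/* Mirrors the order of the checks in the redistributor mapping path. */
static enum gicc_action classify_gicc(uint32_t flags)
{
	if (!gicc_is_usable(flags))
		return GICC_SKIP;

	/* Usable but not enabled: assume the redistributor is inaccessible. */
	if (!(flags & MADT_ENABLED))
		return GICC_NO_ONLINE;

	return GICC_MAP_REDIST;
}
```

The middle case only becomes reachable once the usability helper accepts the
online capable flag, which is why the two hunks need to be read together.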