Re: [PATCH V2 1/2] perf/x86/intel/uncore: Skip discovery table for offline dies

From: Mi, Dapeng

Date: Tue Jan 13 2026 - 21:31:34 EST



On 1/14/2026 4:56 AM, Zide Chen wrote:
> This warning can be triggered if NUMA is disabled and the system
> boots with fewer CPUs than the number of CPUs in die 0.
>
> WARNING: CPU: 9 PID: 7257 at uncore.c:1157 uncore_pci_pmu_register+0x136/0x160 [intel_uncore]
>
> Currently, the discovery table continues to be parsed even if all CPUs
> in the associated die are offline. This can lead to an array overflow
> at "pmu->boxes[die] = box" in uncore_pci_pmu_register(), which may
> trigger the warning above or cause other issues.
>
> Reported-by: Steve Wahl <steve.wahl@xxxxxxx>
> Tested-by: Steve Wahl <steve.wahl@xxxxxxx>
> Fixes: edae1f06c2cd ("perf/x86/intel/uncore: Parse uncore discovery tables")
> Signed-off-by: Zide Chen <zide.chen@xxxxxxxxx>
> ---
> V2:
> - Add the Tested-by tag
> - Rebase onto perf/core (base commit: a491c02c2770)
>
> arch/x86/events/intel/uncore.c | 4 ++++
> arch/x86/events/intel/uncore_discovery.c | 2 +-
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
> index 4684649109d9..c126a29ab729 100644
> --- a/arch/x86/events/intel/uncore.c
> +++ b/arch/x86/events/intel/uncore.c
> @@ -1368,6 +1368,10 @@ static void uncore_pci_pmus_register(void)
>
> 		for (node = rb_first(type->boxes); node; node = rb_next(node)) {
> 			unit = rb_entry(node, struct intel_uncore_discovery_unit, node);
> +
> +			if (WARN_ON(unit->die >= uncore_max_dies()))
> +				continue;

I'm wondering whether the WARN_ON is needed here. Since every uncore unit
whose die ID is >= uncore_max_dies() is already skipped in the discovery
phase, the unit die ID should always be less than uncore_max_dies() by the
time we reach uncore_pci_pmus_register(). Is that right?
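
To illustrate the reasoning, here is a minimal userspace sketch, not kernel
code. struct unit, discover_units() and count_out_of_range() are made-up
names for illustration only; the sketch assumes the discovery phase is the
only place units are created and that it applies the same range check as
the uncore_discovery.c hunk in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_uncore_discovery_unit. */
struct unit {
	int die;
};

/*
 * Discovery phase (sketch): keep only units whose die ID lies in
 * [0, max_dies), mirroring "(die < 0) || (die >= uncore_max_dies())".
 * Returns the number of units stored in @out.
 */
static size_t discover_units(const int *dies, size_t n, int max_dies,
			     struct unit *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (dies[i] < 0 || dies[i] >= max_dies)
			continue;
		out[kept++].die = dies[i];
	}
	return kept;
}

/*
 * Registration phase (sketch): count the units that a second
 * "die >= max_dies" check would reject.
 */
static size_t count_out_of_range(const struct unit *units, size_t n,
				 int max_dies)
{
	size_t bad = 0;

	for (size_t i = 0; i < n; i++)
		if (units[i].die >= max_dies)
			bad++;
	return bad;
}
```

Under those assumptions, count_out_of_range() returns 0 for anything that
went through discover_units() with the same max_dies, so the second check
could only fire if a unit were added by some other path.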


> +
> 			pdev = pci_get_domain_bus_and_slot(UNCORE_DISCOVERY_PCI_DOMAIN(unit->addr),
> 							   UNCORE_DISCOVERY_PCI_BUS(unit->addr),
> 							   UNCORE_DISCOVERY_PCI_DEVFN(unit->addr));
> diff --git a/arch/x86/events/intel/uncore_discovery.c b/arch/x86/events/intel/uncore_discovery.c
> index b46575254dbe..0e414cecb6f2 100644
> --- a/arch/x86/events/intel/uncore_discovery.c
> +++ b/arch/x86/events/intel/uncore_discovery.c
> @@ -366,7 +366,7 @@ static bool uncore_discovery_pci(struct uncore_discovery_domain *domain)
> 				     (val & UNCORE_DISCOVERY_DVSEC2_BIR_MASK) * UNCORE_DISCOVERY_BIR_STEP;
>
> 			die = get_device_die_id(dev);
> -			if (die < 0)
> +			if ((die < 0) || (die >= uncore_max_dies()))
> 				continue;
>
> 			parse_discovery_table(domain, dev, die, bar_offset, &parsed);