Re: [PATCH V6 4/5] perf/x86/intel/uncore: Fix PMON enumeration with NUMA disabled

From: Chen, Zide

Date: Wed Apr 01 2026 - 16:29:25 EST




On 3/30/2026 6:26 PM, Mi, Dapeng wrote:
>
> On 3/31/2026 5:24 AM, Zide Chen wrote:
>> When NUMA is disabled on a NUMA-capable platform, UPI and M3UPI PMON
>> units are not enumerated.
>>
>> In this case, pcibus_to_node() always returns NUMA_NO_NODE, causing
>> uncore_device_to_die() to return -1 for all PCI devices. As a result,
>> the corresponding PMON units are not added to the RB tree.
>>
>> These PMON units are per-die resources, and their utility when NUMA is
>> disabled is limited. The driver does not prohibit their use, and the
>> enumeration should still work correctly.
>>
>> Fix this by using uncore_pcibus_to_dieid(), which works regardless of
>> whether NUMA is enabled. This requires calling
>> snbep_pci2phy_map_init() in spr_uncore_pci_init().
>>
>> Since pci_init() is called before mmio_init(), remove the redundant
>> snbep_pci2phy_map_init() call from spr_uncore_mmio_init(). If
>> snbep_pci2phy_map_init() fails, uncore driver should be bailed out,
>> so the fallback path in spr_uncore_mmio_init() can be removed.
>>
>> Signed-off-by: Zide Chen <zide.chen@xxxxxxxxx>
>> ---
>> V6:
>> - Split from patch v5 3/4.
>> - Remove the redundant call in spr_uncore_mmio_init().
>> - Update commit messages.
>> ---
>> arch/x86/events/intel/uncore.c | 1 +
>> arch/x86/events/intel/uncore_snbep.c | 26 +++++++++++---------------
>> 2 files changed, 12 insertions(+), 15 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>> index 786bd51a0d89..e9cc1ba921c5 100644
>> --- a/arch/x86/events/intel/uncore.c
>> +++ b/arch/x86/events/intel/uncore.c
>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>> return bus ? pci_domain_nr(bus) : -EINVAL;
>> }
>>
>> +/* Note: This API can only be used when NUMA information is available. */
>> int uncore_device_to_die(struct pci_dev *dev)
>> {
>> int node = pcibus_to_node(dev->bus);
>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>> index 8ee06d4659bb..73da1e88e286 100644
>> --- a/arch/x86/events/intel/uncore_snbep.c
>> +++ b/arch/x86/events/intel/uncore_snbep.c
>> @@ -6415,7 +6415,7 @@ static void spr_update_device_location(int type_id)
>>
>> while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>
>> - die = uncore_device_to_die(dev);
>> + die = uncore_pcibus_to_dieid(dev->bus);
>> if (die < 0)
>> continue;
>>
>> @@ -6439,6 +6439,10 @@ static void spr_update_device_location(int type_id)
>>
>> int spr_uncore_pci_init(void)
>> {
>> + int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>> + if (ret)
>> + return ret;
>> +
>> /*
>> * The discovery table of UPI on some SPR variant is broken,
>> * which impacts the detection of both UPI and M3UPI uncore PMON.
>> @@ -6460,21 +6464,13 @@ int spr_uncore_pci_init(void)
>>
>> void spr_uncore_mmio_init(void)
>> {
>> - int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>> + uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>> + UNCORE_SPR_MMIO_EXTRA_UNCORES,
>> + spr_mmio_uncores,
>> + UNCORE_SPR_NUM_UNCORE_TYPES,
>> + spr_uncores);
>>
>> - if (ret) {
>> - uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 0, NULL,
>> - UNCORE_SPR_NUM_UNCORE_TYPES,
>> - spr_uncores);
>> - } else {
>> - uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
>> - UNCORE_SPR_MMIO_EXTRA_UNCORES,
>> - spr_mmio_uncores,
>> - UNCORE_SPR_NUM_UNCORE_TYPES,
>> - spr_uncores);
>> -
>> - spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>> - }
>> + spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
>
> I'm not sure if we can directly remove the snbep_pci2phy_map_init() call
> here. In theory, the snbep_pci2phy_map_init() call in spr_uncore_pci_init()
> could fail and then spr_uncore_mmio_init() doesn't know it and directly
> initializes MMIO PMU, then it could lead to the MMIO initialization fails.


Yes, this is true. But I would argue that the fix in this patch is
correct, and the issue you pointed out is not new: the uncore driver
registers a PMU device without guaranteeing it's functioning.

This is because the Intel uncore driver takes a lazy init approach:
when init_box() fails, the driver doesn't unregister the inaccessible
PMU devices. For example, intel_generic_uncore_mmio_init_box() could
fail for a number of reasons, leaving all associated PMU devices
non-functional.

Originally the uncore driver tried to enumerate PCI/MSR/MMIO uncore
independently, but evolving hardware complexity makes this more
challenging. This patch is just one example: the IMC free-running
counters are MMIO-accessed but rely on PCI devices to read the
die-specific MMIO base address. Explicitly gating sysfs node creation
with PCI init code in mmio_init() is neither clean nor reliable.

To fix it, it seems reasonable to have init_box() return int and
unregister the PMU device if it is deemed inaccessible, similar to
what perf_event_ibs_init() does.

--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -129,7 +129,7 @@ struct intel_uncore_type {
#define events_group attr_groups[2]

struct intel_uncore_ops {
- void (*init_box)(struct intel_uncore_box *);
+ int (*init_box)(struct intel_uncore_box *);

--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1155,7 +1155,8 @@ static int uncore_pci_pmu_register(struct pci_dev *pdev,
box->dieid = die;
box->pci_dev = pdev;
box->pmu = pmu;
- uncore_box_init(box);
+ ret = uncore_box_init(box);
+ if (ret)
+ return ret;

@@ -1598,8 +1599,10 @@ static int uncore_box_ref(struct intel_uncore_type **types,
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
box = pmu->boxes[id];
- if (box && box->cpu >= 0 && atomic_inc_return(&box->refcnt) == 1)
- uncore_box_init(box);
+ if (box && box->cpu >= 0 && atomic_inc_return(&box->refcnt) == 1)
+ if (uncore_box_init(box))
+ uncore_pmu_unregister(pmu);
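
The idea in the hunks above can be modeled outside the kernel. The
following is only a standalone userspace sketch, not driver code: the
model_* names and the mmio_mappable flag are hypothetical stand-ins
for intel_uncore_box, intel_uncore_pmu and whatever condition makes
init_box() fail. It shows lazy init on the first reference, with a
failing init_box() causing the PMU to be unregistered.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_uncore_pmu. */
struct model_pmu {
	bool registered;
};

/* Hypothetical stand-in for struct intel_uncore_box. */
struct model_box {
	int refcnt;
	bool mmio_mappable;	/* assumption: models whether init can succeed */
	bool initialized;
	struct model_pmu *pmu;
};

/* Proposed shape: init_box() reports failure instead of returning void. */
static int model_init_box(struct model_box *box)
{
	if (!box->mmio_mappable)
		return -ENODEV;
	box->initialized = true;
	return 0;
}

static void model_pmu_unregister(struct model_pmu *pmu)
{
	pmu->registered = false;
}

/* Lazy init on the first reference; unregister the PMU if init fails. */
static void model_box_ref(struct model_box *box)
{
	if (box && ++box->refcnt == 1) {
		if (model_init_box(box))
			model_pmu_unregister(box->pmu);
	}
}
```

With a void init_box(), the failing case would leave the PMU
registered but unusable, which is the existing behavior described
above.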


> Currently the PCI, CPU and MMIO initialization are totally independent,
> only when the 3 types initialization all fail, then uncore PMU can abort.
>
> ``` 
>
>    if (uncore_init->pci_init) {
>         pret = uncore_init->pci_init();
>         if (!pret)
>             pret = uncore_pci_init();
>     }
>
>     if (uncore_init->cpu_init) {
>         uncore_init->cpu_init();
>         cret = uncore_cpu_init();
>     }
>
>     if (uncore_init->mmio_init) {
>         uncore_init->mmio_init();
>         mret = uncore_mmio_init();
>     }
>
>     if (cret && pret && mret) {
>         ret = -ENODEV;
>         goto free_discovery;
>     }
> ```
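
The abort condition in the quoted snippet can be reduced to a small
standalone model. This is a sketch only; model_uncore_init_result()
is a hypothetical helper, not a function in the driver. It shows that
a failure on the PCI path alone does not stop the driver from going
ahead with CPU and MMIO uncores.

```c
#include <errno.h>

/*
 * Hypothetical helper mirroring the final check in the quoted code:
 * the driver bails out only when the PCI, CPU and MMIO paths all fail.
 */
static int model_uncore_init_result(int pret, int cret, int mret)
{
	if (cret && pret && mret)
		return -ENODEV;
	return 0;
}
```

So model_uncore_init_result(-ENODEV, 0, 0) yields 0: the PCI failure
is not visible to the MMIO path.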
>
>
>> }
>>
>> /* end of SPR uncore support */