Re: [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs
From: Chen, Zide
Date: Thu Mar 26 2026 - 20:02:14 EST
On 3/25/2026 11:03 PM, Mi, Dapeng wrote:
> Zide, Sashiko gave some comments on this patch. Could you please have a
> look if they are reasonable? Thanks.
>
> https://sashiko.dev/#/patchset/20260324214932.10068-1-zide.chen%40intel.com
1. Regarding the concern that this change may replace an offline node's
-1 with the die ID of an adjacent online node, I do not think this is an
issue.
After this fix, the logic is the same for both (nr_node_ids <= 8) and
(nr_node_ids > 8): map->pbus_to_dieid[bus] may be written with an
invalid die_id (e.g., -1). This is not an error and is expected in some
cases. We should continue to populate the map->pbus_to_dieid[] array.
Regardless of the traversal order (as determined by the reverse
argument), for a given die, the UBOX device is expected to reside on the
first valid bus in the die it is affined to.
Under the current assignment algorithm, all buses following a UBOX
device, up to the next UBOX device or the end of traversal, are assigned
the same die ID.
For example, on SPR, there are two UBOX devices: one device on bus 0x7e
in die 0, and another on bus 0xfe in die 1. With reversed traversal
order, buses 0xff–0x7f are assigned die ID 1, while buses 0x7e–0x00 are
assigned die ID 0.
If all CPUs in die 1 are offline, then buses 0xff–0x7f are assigned -1.
This is fine.
That being said, the die ID for invalid buses is not consistent, which
is not ideal.
2. Regarding the repeated snbep_pci2phy_map_init() calls. I wanted a
"simple" fix initially. I may need to split this patch into two
separate patches.
>
> On 3/25/2026 5:49 AM, Zide Chen wrote:
>> In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
>> uncore_device_to_die() may return -1 when all CPUs associated
>> with the UBOX device are offline.
>>
>> Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
>>
>> - The current code breaks out of the loop. This is incorrect because
>> pci_get_device() does not guarantee iteration in domain or bus order,
>> so additional UBOX devices may be skipped during the scan.
>>
>> - Returning -EINVAL is incorrect, since marking offline buses with
>> die_id == -1 is expected and should not be treated as an error.
>>
>> Separately, when NUMA is disabled on a NUMA-capable platform,
>> pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die()
>> to return -1 for all PCI devices. As a result,
>> spr_update_device_location(), used on Intel SPR and EMR, ignores the
>> corresponding PMON units and does not add them to the RB tree.
>>
>> Fix this by using uncore_pcibus_to_dieid(), which retrieves topology
>> from the UBOX GIDNIDMAP register and works regardless of whether NUMA
>> is enabled in Linux. This requires snbep_pci2phy_map_init() to be
>> added in spr_uncore_pci_init().
>>
>> Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where
>> NUMA is expected to be enabled.
>>
>> Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
>> Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR")
>> Tested-by: Steve Wahl <steve.wahl@xxxxxxx>
>> Signed-off-by: Zide Chen <zide.chen@xxxxxxxxx>
>> ---
>> V2:
>> - Fix the commit message to note that spr_update_device_location() is
>> used by EMR, not GNR.
>> - Rewrite the commit message for clarity.
>> - Add a Tested-by tag.
>>
>> V5:
>> - Remove unused variable die_id (Dapeng).
>> ---
>> arch/x86/events/intel/uncore.c | 1 +
>> arch/x86/events/intel/uncore_snbep.c | 17 ++++++++---------
>> 2 files changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>> index 786bd51a0d89..e9cc1ba921c5 100644
>> --- a/arch/x86/events/intel/uncore.c
>> +++ b/arch/x86/events/intel/uncore.c
>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>> return bus ? pci_domain_nr(bus) : -EINVAL;
>> }
>>
>> +/* Note: This API can only be used when NUMA information is available. */
>> int uncore_device_to_die(struct pci_dev *dev)
>> {
>> int node = pcibus_to_node(dev->bus);
>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>> index 9b51883fd6fd..5ef205a70559 100644
>> --- a/arch/x86/events/intel/uncore_snbep.c
>> +++ b/arch/x86/events/intel/uncore_snbep.c
>> @@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
>> static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
>> {
>> struct pci_dev *ubox_dev = NULL;
>> - int i, bus, nodeid, segment, die_id;
>> + int i, bus, nodeid, segment;
>> struct pci2phy_map *map;
>> int err = 0;
>> u32 config = 0;
>> @@ -1458,14 +1458,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
>> break;
>> }
>>
>> - map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
>> -
>> + map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
>> raw_spin_unlock(&pci2phy_map_lock);
>> -
>> - if (WARN_ON_ONCE(die_id == -1)) {
>> - err = -EINVAL;
>> - break;
>> - }
>> }
>> }
>>
>> @@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
>>
>> while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>
>> - die = uncore_device_to_die(dev);
>> + die = uncore_pcibus_to_dieid(dev->bus);
>> if (die < 0)
>> continue;
>>
>> @@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
>>
>> int spr_uncore_pci_init(void)
>> {
>> + int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>> +
>> + if (ret)
>> + return ret;
>> +
>> /*
>> * The discovery table of UPI on some SPR variant is broken,
>> * which impacts the detection of both UPI and M3UPI uncore PMON.