Re: [PATCH V5 3/4] perf/x86/intel/uncore: Fix die ID init and look up bugs
From: Mi, Dapeng
Date: Thu Mar 26 2026 - 22:04:00 EST
On 3/27/2026 7:57 AM, Chen, Zide wrote:
>
> On 3/25/2026 11:03 PM, Mi, Dapeng wrote:
>> Zide, Sashiko gave some comments on this patch. Could you please have a
>> look if they are reasonable? Thanks.
>>
>> https://sashiko.dev/#/patchset/20260324214932.10068-1-zide.chen%40intel.com
> 1. Regarding the concern that this change may replace an offline node's
> -1 with the die ID of an adjacent online node, I do not think this is an
> issue.
>
> After this fix, the logic is the same for both (nr_node_ids <= 8) and
> (nr_node_ids > 8): map->pbus_to_dieid[bus] may be written with an
> invalid die_id (e.g., -1). This is not an error and is expected in some
> cases. We should continue to populate the map->pbus_to_dieid[] array.
>
> Regardless of the traversal order (as determined by the reverse
> argument), for a given die, the UBOX device is expected to reside on the
> first valid bus in the die it is affined to.
>
> Under the current assignment algorithm, all buses following a UBOX
> device, up to the next UBOX device or the end of traversal, are assigned
> the same die ID.
>
> For example, on SPR, there are two UBOX devices: one device on bus 0x7e
> in die 0, and another on bus 0xfe in die 1. With reversed traversal
> order, buses 0xff–0x7f are assigned die ID 1, while buses 0x7e–0x00 are
> assigned die ID 0.
>
> If all CPUs in die 1 are offline, then buses 0xff–0x7f are assigned -1.
> This is fine.
>
> That being said, the die ID for invalid buses is not consistent, which
> is not ideal.
Yes, for the case with 2 sockets and socket 1 is offline, it's correct. But
assume there are 4 sockets (0/1/2/3), buses 0x0-0x3f are attached to socket
0, buses 0x40-0x7f are attached to socket 1, buses 0x80-0xbf are attached
to socket 2 and buses 0xc0-0xff are attached to socket 3, the socket 2 is
offline. In reverse order, the die id of buses 0x80-0xbf would be
overwritten to 3 instead of -1, right?
But it seems there is not a good way to fix this issue and the function
spr_update_device_location() won't really find the ubox device of socket 2
since socket 2 has been offline. So it won't cause a real issue.
>
> 2. Regarding the repeated snbep_pci2phy_map_init() calls. I wanted a
> "simple" fix initially. I may need to split this patch into two
> separate patches.
>
>> On 3/25/2026 5:49 AM, Zide Chen wrote:
>>> In snbep_pci2phy_map_init(), in the nr_node_ids > 8 path,
>>> uncore_device_to_die() may return -1 when all CPUs associated
>>> with the UBOX device are offline.
>>>
>>> Remove the WARN_ON_ONCE(die_id == -1) check for two reasons:
>>>
>>> - The current code breaks out of the loop. This is incorrect because
>>> pci_get_device() does not guarantee iteration in domain or bus order,
>>> so additional UBOX devices may be skipped during the scan.
>>>
>>> - Returning -EINVAL is incorrect, since marking offline buses with
>>> die_id == -1 is expected and should not be treated as an error.
>>>
>>> Separately, when NUMA is disabled on a NUMA-capable platform,
>>> pcibus_to_node() returns NUMA_NO_NODE, causing uncore_device_to_die()
>>> to return -1 for all PCI devices. As a result,
>>> spr_update_device_location(), used on Intel SPR and EMR, ignores the
>>> corresponding PMON units and does not add them to the RB tree.
>>>
>>> Fix this by using uncore_pcibus_to_dieid(), which retrieves topology
>>> from the UBOX GIDNIDMAP register and works regardless of whether NUMA
>>> is enabled in Linux. This requires snbep_pci2phy_map_init() to be
>>> added in spr_uncore_pci_init().
>>>
>>> Keep uncore_device_to_die() only for the nr_node_ids > 8 case, where
>>> NUMA is expected to be enabled.
>>>
>>> Fixes: 9a7832ce3d92 ("perf/x86/intel/uncore: With > 8 nodes, get pci bus die id from NUMA info")
>>> Fixes: 65248a9a9ee1 ("perf/x86/uncore: Add a quirk for UPI on SPR")
>>> Tested-by: Steve Wahl <steve.wahl@xxxxxxx>
>>> Signed-off-by: Zide Chen <zide.chen@xxxxxxxxx>
>>> ---
>>> V2:
>>> - Fix the commit message to note that spr_update_device_location() is
>>> used by EMR, not GNR.
>>> - Rewrite the commit message for clarity.
>>> - Add a Tested-by tag.
>>>
>>> V5:
>>> - Remove unused variable die_id (Dapeng).
>>> ---
>>> arch/x86/events/intel/uncore.c | 1 +
>>> arch/x86/events/intel/uncore_snbep.c | 17 ++++++++---------
>>> 2 files changed, 9 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
>>> index 786bd51a0d89..e9cc1ba921c5 100644
>>> --- a/arch/x86/events/intel/uncore.c
>>> +++ b/arch/x86/events/intel/uncore.c
>>> @@ -67,6 +67,7 @@ int uncore_die_to_segment(int die)
>>> return bus ? pci_domain_nr(bus) : -EINVAL;
>>> }
>>>
>>> +/* Note: This API can only be used when NUMA information is available. */
>>> int uncore_device_to_die(struct pci_dev *dev)
>>> {
>>> int node = pcibus_to_node(dev->bus);
>>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>>> index 9b51883fd6fd..5ef205a70559 100644
>>> --- a/arch/x86/events/intel/uncore_snbep.c
>>> +++ b/arch/x86/events/intel/uncore_snbep.c
>>> @@ -1413,7 +1413,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
>>> static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
>>> {
>>> struct pci_dev *ubox_dev = NULL;
>>> - int i, bus, nodeid, segment, die_id;
>>> + int i, bus, nodeid, segment;
>>> struct pci2phy_map *map;
>>> int err = 0;
>>> u32 config = 0;
>>> @@ -1458,14 +1458,8 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
>>> break;
>>> }
>>>
>>> - map->pbus_to_dieid[bus] = die_id = uncore_device_to_die(ubox_dev);
>>> -
>>> + map->pbus_to_dieid[bus] = uncore_device_to_die(ubox_dev);
>>> raw_spin_unlock(&pci2phy_map_lock);
>>> -
>>> - if (WARN_ON_ONCE(die_id == -1)) {
>>> - err = -EINVAL;
>>> - break;
>>> - }
>>> }
>>> }
>>>
>>> @@ -6420,7 +6414,7 @@ static void spr_update_device_location(int type_id)
>>>
>>> while ((dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, dev)) != NULL) {
>>>
>>> - die = uncore_device_to_die(dev);
>>> + die = uncore_pcibus_to_dieid(dev->bus);
>>> if (die < 0)
>>> continue;
>>>
>>> @@ -6444,6 +6438,11 @@ static void spr_update_device_location(int type_id)
>>>
>>> int spr_uncore_pci_init(void)
>>> {
>>> + int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
>>> +
>>> + if (ret)
>>> + return ret;
>>> +
>>> /*
>>> * The discovery table of UPI on some SPR variant is broken,
>>> * which impacts the detection of both UPI and M3UPI uncore PMON.