[PATCH 0/3] perf list: Collapse similar events across PMUs

From: James Clark
Date: Tue Mar 04 2025 - 08:50:22 EST


Some changes related to the discussion here [1] about deduplication of
PMUs. This change stands on its own right now, but it may not go far
enough. This can either be dropped for now or applied and improved
later. Extra ideas are as follows.

Treating alphanumeric suffixes as duplicate has been slightly
problematic due to marketing strings having looks-like-but-not-actually
alphanumeric suffixes. For example 'cpum_cf' and now the one digit
longer than before 'cortex-a720'. The easy fix is to increase the
minimum length considered for deduplication as suggested [1], but as
also mentioned, the current mrvl_ddr_pmu PMU names don't zero pad the
address, meaning that > 2 alphanumeric suffixes are already technically
not enough to deduplicate the same PMUs. They could have only a 2 digit
alphanumeric address suffix. Increasing the minimum digits feels a bit
like kicking the can down the road and it places awkward limitations on
marketing names which we have no control over. Also I'm not sure helps
the following much:

The problem is that arm_cmn_[n] PMUs have a numeric suffix, but they can
have different events. Even if we were adding this PMU today, keeping
the suffix rule in mind, it would be difficult to come up with a suffix
that differentiates the different ones. Flavour words might work, but
that complicates the kernel which would have to group them and come up
with flavours rather than just doing an i++. Deduplicating too
aggressively on only PMU name suffix means only arm_cmn_1's events get
listed, missing other events, and it's hard to see which events relate
to which PMU.

Therefore in addition to the changes in this patchset I'd like to look
into:

* Collapsing duplicate PMU names into ranges, for example
arm_pmu_v3_[0-4], rather than simply concatenating names as done in
this patchset
* Deduplicate uncore based on the contents of events/ rather than just
the suffix

As some background, the original commit for deduplication, commit
3241d46f5f54 ("perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu")
mentions reducing the number of duplicate PMUs, and is presumably
motivated by usability. But there are also other commits mentioning
reducing openat()s, for example lazily loading formats 504026412162
("perf pmu: Make the loading of formats lazy"). Deduplicating based on
the contents of the events/ folder is somewhat in contention with this
reduction, but could be done along side some more lazy loading (like of
the terms) and hashing the result of readdir() without opening any of
the contents. JSON tables can have event name hashes calculated at build
time if we want to consider them for deduplication too.

Then with the events hash, PMU's can be sorted based on this and the
'Unit:' string can be constructed with a set of values that collapses
adjacent suffixes to display as ranges. I believe that could remove the
need for any further changes to duplication based on suffix, but still
avoids over deduplication.

[1]: https://lore.kernel.org/linux-perf-users/CAP-5=fW_Sq4iFxoWPWuixz9fMLBPyPUO0RG0KPbYa-5T0DZbTA@xxxxxxxxxxxxxx/

---
James Clark (3):
perf list: Order events by event name before PMU name
perf list: Collapse similar events across PMUs
perf list: Don't deduplicate core PMUs when listing events

tools/perf/builtin-list.c | 2 +
tools/perf/util/pmu.c | 5 ++-
tools/perf/util/pmu.h | 2 +
tools/perf/util/pmus.c | 95 ++++++++++++++++++++++++++++++++++--------
tools/perf/util/print-events.h | 1 +
5 files changed, 86 insertions(+), 19 deletions(-)
---
base-commit: 7788ad59d1d9617792037a83513be5b1dd14150f
change-id: 20250304-james-perf-hybrid-list-11b888cf435a

Best regards,
--
James Clark <james.clark@xxxxxxxxxx>