Re: [RFC] perf tools: About encodings of legacy event names

From: James Clark
Date: Fri Mar 07 2025 - 09:18:03 EST




On 24/02/2025 3:01 pm, Arnaldo Carvalho de Melo wrote:
On Wed, Feb 19, 2025 at 10:37:33PM -0800, Ian Rogers wrote:
On Wed, Feb 19, 2025 at 4:38 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
Ian and I have been discussing the behaviors of event encodings and it's
hard to find a point so far that we can agree on. So I'd like to hear
your opinion to resolve this. If you happen to have some time, you can
follow the thread here:

https://lore.kernel.org/linux-perf-users/20250109222109.567031-5-irogers@xxxxxxxxxx/#r

Basically there are some events that were defined in the perf ABI.

PERF_COUNT_HW_CPU_CYCLES, PERF_COUNT_HW_INSTRUCTIONS, ...

So perf tools use those (legacy) encodings with the matching names like
"cycles" (or "cpu-cycles"), "instructions", etc.

On the another hand, it has wildcard matching for event names in PMUs so
that users can conveniently use events without specifying PMU names.
For example, "event1" would expand to "pmuX/event1/", "pmuY/event1/", etc
as long as the PMUs have the event in sysfs or JSON.

And there are platforms where "cycles" event is defined in a (core) PMU
(like "blah/cycles") and the event has a different behavior than the
legacy encoding. Then it has to choose which encoding is better for the
given name. But it's a general issue for the legacy event names.

So we pick the "legacy" one, as always, and the one that is in a PMU
needs to have its pmu name specified, no?

Q. what should it do with "cycles"?
-----------------------------------
1. just use the legacy encoding (PERF_COUNT_HW_CPU_CYCLES).

Right

2. expand to (possibly multiple) PMU events (like "cpu/cycles/") and
use the encoding defined there.

Nope

I think the #1 is the current behavior. The upside is it's simple and
intuitive. But it's different than other names and can make confusion.
Users should use the full name ("cpu/cycles/") if they need sysfs one.

Right
So the code keeps changing, for example, the removal of BPF events. I

Humm, this seems like a different discussion.

think the important change and the thing that has brought us here is
the addition of what Intel call hybrid and ARM call BIG.little. ARM

Right, the support for hybrid systems brought lots of problems, most
people didn't have access to such systems and thus couldn't test
patches, so the attempt was to keep the existing code working as it had
been while allowing for the support for the new hybrid systems.

As more people get access to hybrid systems we started noticing problems
and working on fixing it, you did a lot of work in this area, highly
appreciated.

have got architectural events and so the same event encoding on BIG
and little cores. On x86 the e-core (atom) and p-cores have different
event encodings. In the original hybrid work, pushed on for the launch
of Alderlake and reviewed by Jiri and Arnaldo with no involvement from
me, Intel wanted the event names to work transparently. So a cycles

Without access to such systems, yes, see above.

event would be gathered on the e-core and p-core with a command line
option to merge the legacy event cycles on both cores. To be specific
the expected behavior of:
$ perf (stat|record) -e cycles ...
would be as if:
$ perf (stat|record) -e cpu_atom/cycles/,cpu_core/cycles/ ...

Yes, and if somebody wants to profile in just one of those core types,
just specify the "cpu_atom" or "cpu_core" to have it fully specified.
An unfortunate thing in the hybrid code was that it was hardcoded to
PMU names of cpu_atom and cpu_core, but what to do for metrics? The

Yeah, metrics IIRC came before hybrid systems.

original proposal was that metrics would have a PMU name and all names
would be relative to that, but what about uncore events? What about
metrics that refer to other metrics? A decision was made to not to
have PMUs implicitly in metrics and the events in the metric would
match those given to perf's -e command line. This also greatly
simplifies testing events when creating a metric.

I set about rewriting the hybrid code not to use hard coded PMU names
but to discover core PMUs at runtime. This was to make the metric and
event parsing code as generic as possible, but this had an unintended
consequence. ARM's core PMU had previously not been seen as handling
legacy events like cycles, and appeared as an uncore PMU. When there

are both legacy and sysfs/json events then previously the legacy
events had priority. This broke events like cycles on Apple M
processors where the legacy events were broken and the sysfs ones not.
Yes this is a driver bug, but everybody told me a change in behavior
of the tool was needed to fix it, that fix was to make sysfs/json
events have priority over legacy events. I prioritized fixing event
parsing when a PMU was specified but given "cycles" means
"cpu_atom/cycles/ and cpu_core/cycles/" for hybrid, why would the
prioritization be different without a PMU specified?
I knew of this tech debt and separately RISC-V was also interested to
have sysfs/json be the priority so that the legacy to config encoding
could exist more in the perf tool than the PMU driver. I'm a SIG

I saw them saying that supporting PERF_TYPE_HARDWARE counters was ok as
they didn't want to break the perf tooling workflow, no?


Doesn't most of the discussion stem from this particular point? I also understood it that way, that risc-v folks agreed it was better to support these to make all existing software work, not just Perf.

Maybe one issue was calling them 'legacy' events in the first place, and I'm not sure if there is complete consensus that these are legacy. Can't they continue be the short easy list of events likely to be common across platforms? If there is an issue with some of them being wrong in some places we can move forward from that by making sure new platforms do it right, rather than changing the logic for everyone to fix that bug.

For the argument that Google prefers to use the sysfs events because of these differences, I don't think there is anything preventing that kind of use today? Or at least not for the main priority flip proposed, but maybe there are some smaller adjacent bugs that can be fixed up separately.

Thanks
James