Re: [PATCH v4 03/18] perf jevents: Add set of common metrics based on default ones

From: James Clark
Date: Tue Nov 18 2025 - 05:58:10 EST




On 18/11/2025 7:29 am, Namhyung Kim wrote:
On Mon, Nov 17, 2025 at 06:28:31PM -0800, Ian Rogers wrote:
On Mon, Nov 17, 2025 at 5:37 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:

On Sat, Nov 15, 2025 at 07:29:29PM -0800, Ian Rogers wrote:
On Sat, Nov 15, 2025 at 9:52 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:

On Fri, Nov 14, 2025 at 08:57:39AM -0800, Ian Rogers wrote:
On Fri, Nov 14, 2025 at 8:28 AM James Clark <james.clark@xxxxxxxxxx> wrote:



On 11/11/2025 9:21 pm, Ian Rogers wrote:
Add support for getting a common set of metrics from a default
table. It simplifies the generation to add json metrics at the same
time. The metrics added are CPUs_utilized, cs_per_second,
migrations_per_second, page_faults_per_second, insn_per_cycle,
stalled_cycles_per_instruction, frontend_cycles_idle,
backend_cycles_idle, cycles_frequency, branch_frequency and
branch_miss_rate based on the shadow metric definitions.

Following this change the default perf stat output on an alderlake
looks like:
```
$ perf stat -a -- sleep 2

Performance counter stats for 'system wide':

0.00 msec cpu-clock # 0.000 CPUs utilized
77,739 context-switches
15,033 cpu-migrations
321,313 page-faults
14,355,634,225 cpu_atom/instructions/ # 1.40 insn per cycle (35.37%)
134,561,560,583 cpu_core/instructions/ # 3.44 insn per cycle (57.85%)
10,263,836,145 cpu_atom/cycles/ (35.42%)
39,138,632,894 cpu_core/cycles/ (57.60%)
2,989,658,777 cpu_atom/branches/ (42.60%)
32,170,570,388 cpu_core/branches/ (57.39%)
29,789,870 cpu_atom/branch-misses/ # 1.00% of all branches (42.69%)
165,991,152 cpu_core/branch-misses/ # 0.52% of all branches (57.19%)
(software) # nan cs/sec cs_per_second
TopdownL1 (cpu_core) # 11.9 % tma_bad_speculation
# 19.6 % tma_frontend_bound (63.97%)
TopdownL1 (cpu_core) # 18.8 % tma_backend_bound
# 49.7 % tma_retiring (63.97%)
(software) # nan faults/sec page_faults_per_second
# nan GHz cycles_frequency (42.88%)
# nan GHz cycles_frequency (69.88%)
TopdownL1 (cpu_atom) # 11.7 % tma_bad_speculation
# 29.9 % tma_retiring (50.07%)
TopdownL1 (cpu_atom) # 31.3 % tma_frontend_bound (43.09%)
(cpu_atom) # nan M/sec branch_frequency (43.09%)
# nan M/sec branch_frequency (70.07%)
# nan migrations/sec migrations_per_second
TopdownL1 (cpu_atom) # 27.1 % tma_backend_bound (43.08%)
(software) # 0.0 CPUs CPUs_utilized
# 1.4 instructions insn_per_cycle (43.04%)
# 3.5 instructions insn_per_cycle (69.99%)
# 1.0 % branch_miss_rate (35.46%)
# 0.5 % branch_miss_rate (65.02%)

2.005626564 seconds time elapsed
```

Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
---
.../arch/common/common/metrics.json | 86 +++++++++++++
tools/perf/pmu-events/empty-pmu-events.c | 115 +++++++++++++-----
tools/perf/pmu-events/jevents.py | 21 +++-
tools/perf/pmu-events/pmu-events.h | 1 +
tools/perf/util/metricgroup.c | 31 +++--
5 files changed, 212 insertions(+), 42 deletions(-)
create mode 100644 tools/perf/pmu-events/arch/common/common/metrics.json

diff --git a/tools/perf/pmu-events/arch/common/common/metrics.json b/tools/perf/pmu-events/arch/common/common/metrics.json
new file mode 100644
index 000000000000..d915be51e300
--- /dev/null
+++ b/tools/perf/pmu-events/arch/common/common/metrics.json
@@ -0,0 +1,86 @@
+[
+ {
+ "BriefDescription": "Average CPU utilization",
+ "MetricExpr": "(software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@) / (duration_time * 1e9)",

Hi Ian,

I noticed that this metric is making "perf stat tests" fail.
"duration_time" is a tool event and they don't work with "perf stat
record" anymore. The test tests the record command with the default args
which results in this event being used and a failure.
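
For reference, the arithmetic in the quoted MetricExpr can be sketched in Python. This is only an illustration, not code from the patch. The function name is made up, and the units are an assumption read off the 1e9 factor in the expression (clock count in nanoseconds, duration_time in seconds). Returning nan for a zero duration is a choice made for this sketch.

```python
def cpus_utilized(clock_ns: float, duration_s: float) -> float:
    """Mirror the quoted expression: clock / (duration_time * 1e9).

    Assumed units: clock_ns in nanoseconds, duration_s in seconds.
    """
    if duration_s <= 0:
        # Sketch-only choice: report nan rather than raise on a zero duration.
        return float("nan")
    return clock_ns / (duration_s * 1e9)


# Hypothetical numbers: 4 s of accumulated CPU time over a 2 s wall-clock run.
print(cpus_utilized(4_000_000_000, 2.0))
```

The point is only that the metric cannot be evaluated without a value for duration_time, which is why a non-working tool event breaks it.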

I suppose there are three issues. The first two are unrelated to this change:

- Perf stat record continues to write out a bad perf.data file even
though it knows that tool events won't work.

For example 'status' ends up being -1 in cmd_stat() but it's ignored
for some of the writing parts. It does decide to not print any stdout
though:

$ perf stat record -e "duration_time"
<blank>

- The other issue is obviously that tool events don't work with perf
stat record which seems to be a regression from 6828d6929b76 ("perf
evsel: Refactor tool events")

- The third issue is that this change adds a broken tool event to the
default output of perf stat

I'm not actually sure what "perf stat record" is for? It's possible that
it's not used anymore, especially if nobody noticed that tool events
haven't been working in it for a while.

I think we're also supposed to have json output for perf stat (although
this is also broken in some obscure scenarios), so maybe perf stat
record isn't needed anymore?

Hi James,

Thanks for the report. I think this also overlaps with perf stat
metrics not working with perf stat record, which matters now because
these changes made metrics the default. Let me do some follow up work as the perf
script work shows we can do useful things with metrics while not being
on a live perf stat - there's the obstacle that the CPUID of the host
will be used :-/

Anyway, I'll take a look and we should add a test for this. There is
one that checks the perf stat json output is okay, for some definition of okay. One
problem is that the stat-display code is complete spaghetti. Now that
stat-shadow only handles json metrics, and perf script isn't trying to
maintain a set of shadow counters, that is a little bit improved.

I have another test failure on this. On my AMD machine, perf all
metrics test fails due to missing "LLC-loads" events.

$ sudo perf stat -M llc_miss_rate true
Error:
No supported events found.
The LLC-loads event is not supported.

Maybe we need to make some cache metrics conditional as some events are
missing.

Maybe we can `perf list Default`, etc. if this is a problem. We have
similar unsupported events in metrics on Intel like:

```
$ perf stat -M itlb_miss_rate -a sleep 1

Performance counter stats for 'system wide':

<not supported> iTLB-loads
168,926 iTLB-load-misses

1.002287122 seconds time elapsed
```

but I've not seen failures:

```
$ perf test -v "all metrics"
103: perf all metrics test : Skip
```

$ sudo perf test -v "all metrics"
--- start ---
test child forked, pid 1347112
Testing CPUs_utilized
Testing backend_cycles_idle
Not supported events
Performance counter stats for 'system wide': <not counted> cpu-cycles <not supported> stalled-cycles-backend 0.013162328 seconds time elapsed
Testing branch_frequency
Testing branch_miss_rate
Testing cs_per_second
Testing cycles_frequency
Testing frontend_cycles_idle
Testing insn_per_cycle
Testing migrations_per_second
Testing page_faults_per_second
Testing stalled_cycles_per_instruction
Testing l1d_miss_rate
Testing llc_miss_rate
Metric contains missing events
Error: No supported events found. The LLC-loads event is not supported.

Right, but this should match the Intel case as iTLB-loads is an
unsupported event so I'm not sure why we don't see a failure on Intel
but do on AMD given both events are legacy cache ones. I'll need to
trace through the code (or uftrace it :-) ).
                        ^^^^^^^^^^^^^^^^^
That'd be fun! ;-)

Thanks,
Namhyung


There is the same "LLC-loads event is not supported" issue with this test on Arm too. (But it's from patch 5 rather than 3, just for the avoidance of confusion).
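
To make the difference between the two cases quoted above concrete, here is a rough Python sketch that separates the two output shapes seen in this thread: a run that completes but reports `<not supported>` for an event (the iTLB-loads case), and a run that aborts with "No supported events found" (the LLC-loads case). The function name and return labels are hypothetical, and this is not the logic of the actual "all metrics" test. It only matches the strings quoted above.

```python
def classify_stat_output(output: str) -> str:
    """Classify perf stat output using strings quoted in this thread.

    Returns one of three hypothetical labels:
      "hard_error"        - perf refused to run ("No supported events found")
      "unsupported_event" - perf ran but printed "<not supported>" for an event
      "ok"                - neither marker is present
    """
    # Check the hard error first: perf aborted before counting anything.
    if "No supported events found" in output:
        return "hard_error"
    if "<not supported>" in output:
        return "unsupported_event"
    return "ok"


amd_like = "Error:\nNo supported events found.\nThe LLC-loads event is not supported."
intel_like = "<not supported> iTLB-loads\n168,926 iTLB-load-misses"
print(classify_stat_output(amd_like))
print(classify_stat_output(intel_like))
```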