Re: [PATCH v3 03/18] perf jevents: Add set of common metrics based on default ones
From: Ian Rogers
Date: Tue Nov 11 2025 - 13:38:48 EST
On Mon, Nov 10, 2025 at 10:37 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> On Mon, Nov 10, 2025 at 08:04:02PM -0800, Ian Rogers wrote:
> > Add support for getting a common set of metrics from a default
> > table. Adding the json metrics at the same time simplifies the
> > generation. The metrics added are CPUs_utilized, cs_per_second,
> > migrations_per_second, page_faults_per_second, insn_per_cycle,
> > stalled_cycles_per_instruction, frontend_cycles_idle,
> > backend_cycles_idle, cycles_frequency, branch_frequency and
> > branch_miss_rate based on the shadow metric definitions.
> >
> > Following this change the default perf stat output on an alderlake looks like:
> > ```
> > $ perf stat -a -- sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > 28,165,735,434 cpu-clock # 27.973 CPUs utilized
> > 23,220 context-switches # 824.406 /sec
> > 833 cpu-migrations # 29.575 /sec
> > 35,293 page-faults # 1.253 K/sec
> > 997,341,554 cpu_atom/instructions/ # 0.84 insn per cycle (35.63%)
> > 11,197,053,736 cpu_core/instructions/ # 1.97 insn per cycle (58.21%)
> > 1,184,871,493 cpu_atom/cycles/ # 0.042 GHz (35.64%)
> > 5,676,692,769 cpu_core/cycles/ # 0.202 GHz (58.22%)
> > 150,525,309 cpu_atom/branches/ # 5.344 M/sec (42.80%)
> > 2,277,232,030 cpu_core/branches/ # 80.851 M/sec (58.21%)
> > 5,248,575 cpu_atom/branch-misses/ # 3.49% of all branches (42.82%)
> > 28,829,930 cpu_core/branch-misses/ # 1.27% of all branches (58.22%)
> > (software) # 824.4 cs/sec cs_per_second
> > TopdownL1 (cpu_core) # 12.6 % tma_bad_speculation
> > # 28.8 % tma_frontend_bound (66.57%)
> > TopdownL1 (cpu_core) # 25.8 % tma_backend_bound
> > # 32.8 % tma_retiring (66.57%)
> > (software) # 1253.1 faults/sec page_faults_per_second
> > # 0.0 GHz cycles_frequency (42.80%)
> > # 0.2 GHz cycles_frequency (74.92%)
> > TopdownL1 (cpu_atom) # 22.3 % tma_bad_speculation
> > # 17.2 % tma_retiring (49.95%)
> > TopdownL1 (cpu_atom) # 30.6 % tma_backend_bound
> > # 29.8 % tma_frontend_bound (49.94%)
> > (cpu_atom) # 6.9 K/sec branch_frequency (42.89%)
> > # 80.5 K/sec branch_frequency (74.93%)
> > # 29.6 migrations/sec migrations_per_second
> > # 28.0 CPUs CPUs_utilized
> > (cpu_atom) # 0.8 instructions insn_per_cycle (42.91%)
> > # 2.0 instructions insn_per_cycle (75.14%)
> > (cpu_atom) # 3.8 % branch_miss_rate (35.75%)
> > # 1.2 % branch_miss_rate (66.86%)
> >
> > 1.007063529 seconds time elapsed
> > ```
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > .../arch/common/common/metrics.json | 86 +++++++++++++
> > tools/perf/pmu-events/empty-pmu-events.c | 115 +++++++++++++-----
> > tools/perf/pmu-events/jevents.py | 21 +++-
> > tools/perf/pmu-events/pmu-events.h | 1 +
> > tools/perf/util/metricgroup.c | 31 +++--
> > 5 files changed, 212 insertions(+), 42 deletions(-)
> > create mode 100644 tools/perf/pmu-events/arch/common/common/metrics.json
> >
> > diff --git a/tools/perf/pmu-events/arch/common/common/metrics.json b/tools/perf/pmu-events/arch/common/common/metrics.json
> > new file mode 100644
> > index 000000000000..d1e37db18dc6
> > --- /dev/null
> > +++ b/tools/perf/pmu-events/arch/common/common/metrics.json
> > @@ -0,0 +1,86 @@
> > +[
> > + {
> > + "BriefDescription": "Average CPU utilization",
> > + "MetricExpr": "(software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@) / (duration_time * 1e9)",
> > + "MetricGroup": "Default",
> > + "MetricName": "CPUs_utilized",
> > + "ScaleUnit": "1CPUs",
> > + "MetricConstraint": "NO_GROUP_EVENTS"
> > + },
> > + {
> > + "BriefDescription": "Context switches per CPU second",
> > + "MetricExpr": "(software@context\\-switches\\,name\\=context\\-switches@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > + "MetricGroup": "Default",
> > + "MetricName": "cs_per_second",
> > + "ScaleUnit": "1cs/sec",
> > + "MetricConstraint": "NO_GROUP_EVENTS"
> > + },
> > + {
> > + "BriefDescription": "Process migrations to a new CPU per CPU second",
> > + "MetricExpr": "(software@cpu\\-migrations\\,name\\=cpu\\-migrations@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > + "MetricGroup": "Default",
> > + "MetricName": "migrations_per_second",
> > + "ScaleUnit": "1migrations/sec",
> > + "MetricConstraint": "NO_GROUP_EVENTS"
> > + },
> > + {
> > + "BriefDescription": "Page faults per CPU second",
> > + "MetricExpr": "(software@page\\-faults\\,name\\=page\\-faults@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > + "MetricGroup": "Default",
> > + "MetricName": "page_faults_per_second",
> > + "ScaleUnit": "1faults/sec",
> > + "MetricConstraint": "NO_GROUP_EVENTS"
> > + },
> > + {
> > + "BriefDescription": "Instructions Per Cycle",
> > + "MetricExpr": "instructions / cpu\\-cycles",
> > + "MetricGroup": "Default",
> > + "MetricName": "insn_per_cycle",
> > + "MetricThreshold": "insn_per_cycle < 1",
> > + "ScaleUnit": "1instructions"
> > + },
> > + {
> > + "BriefDescription": "Max front or backend stalls per instruction",
> > + "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
> > + "MetricGroup": "Default",
> > + "MetricName": "stalled_cycles_per_instruction"
> > + },
> > + {
> > + "BriefDescription": "Frontend stalls per cycle",
> > + "MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
> > + "MetricGroup": "Default",
> > + "MetricName": "frontend_cycles_idle",
> > + "MetricThreshold": "frontend_cycles_idle > 0.1"
> > + },
> > + {
> > + "BriefDescription": "Backend stalls per cycle",
> > + "MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
> > + "MetricGroup": "Default",
> > + "MetricName": "backend_cycles_idle",
> > + "MetricThreshold": "backend_cycles_idle > 0.2"
> > + },
> > + {
> > + "BriefDescription": "Cycles per CPU second",
> > + "MetricExpr": "cpu\\-cycles / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > + "MetricGroup": "Default",
> > + "MetricName": "cycles_frequency",
> > + "ScaleUnit": "1GHz",
> > + "MetricConstraint": "NO_GROUP_EVENTS"
> > + },
> > + {
> > + "BriefDescription": "Branches per CPU second",
> > + "MetricExpr": "branches / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > + "MetricGroup": "Default",
> > + "MetricName": "branch_frequency",
> > + "ScaleUnit": "1000K/sec",
>
> We discussed that it should be 1000M/sec, right?
Thanks! Will fix in v4 - I thought this was fixed in v2, but I must
have missed it.
Ian
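
For reference, a minimal sketch of the arithmetic behind the ScaleUnit question above. It assumes cpu-clock is counted in nanoseconds (as the `* 1e9` factors in the per-second metric expressions suggest) and that a ScaleUnit is a numeric factor followed by a unit label. `split_scale_unit` and `scaled_metric` are hypothetical helpers for illustration only, not perf's parser. The counts are taken from the cpu_core/branches/ and cpu-clock lines of the quoted output.

```python
import re
from typing import Tuple


def split_scale_unit(scale_unit: str) -> Tuple[float, str]:
    """Split a string such as '1000M/sec' into (1000.0, 'M/sec').

    Hypothetical helper: a leading number is the factor, the rest is the unit.
    """
    m = re.match(r"^(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)(.*)$", scale_unit)
    if m is None:
        raise ValueError(f"no leading scale factor in {scale_unit!r}")
    return float(m.group(1)), m.group(2)


def scaled_metric(count: int, clock_ns: int, scale_unit: str) -> Tuple[float, str]:
    """Evaluate 'count / clock' and apply the ScaleUnit factor."""
    factor, unit = split_scale_unit(scale_unit)
    # count / clock_ns is events per nanosecond (assumption: clock is in ns).
    return count / clock_ns * factor, unit


# Counts from the quoted perf stat output.
branches = 2_277_232_030      # cpu_core/branches/
cpu_clock_ns = 28_165_735_434  # cpu-clock

# One event per nanosecond is 1e9 events per second, i.e. 1000 M/sec.
value, unit = scaled_metric(branches, cpu_clock_ns, "1000M/sec")
print(f"{value:.3f} {unit}")
```

With "1000K/sec" the computed number is identical but the label is three orders of magnitude too small, which is consistent with the K/sec branch_frequency lines in the quoted output.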