Support standalone metrics and metric groups for perf
From: Andi Kleen
Date: Thu Aug 31 2017 - 15:40:54 EST
Add generic support for standalone metrics specified in JSON files
to perf stat. A metric is a formula that uses multiple events
to compute a higher level result (e.g. IPC).
For more complex metrics we need to have micro architecture
specific knowledge, so it makes sense to tie metrics to
JSON event lists.
Previously metrics were always tied to an event and automatically
enabled with that event. But now change it that we can have
standalone metrics. They are in the same JSON data structure
as events, but don't have an event name, only a metric name.
We also allow to organize the metrics in metric groups, which
allows a short cut to select several related metrics at once.
This patch kit adds the code to perf to manage metric groups
The first few patches are generic bug fixes and can be applied
directly. Then there is a 'weak group' feature that is useful
independently from metrics. After there are metrics specific
patches.
The patches are available in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/metric-group-6
The actual Intel JSON metrics are available in git as a separate pull
request in
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git perf/intel-json-metrics-2
Some example output:
% perf list metricgroup
..
Metric Groups:
DSB:
DSB_Coverage
[Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
FLOPS:
GFLOPs
[Giga Floating Point Operations Per Second]
Frontend:
IFetch_Line_Utilization
[Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions]
Frontend_Bandwidth:
DSB_Coverage
[Fraction of Uops delivered by the DSB (aka Decoded Icache; or Uop Cache)]
Memory_BW:
MLP
[Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)]
% perf stat -M Summary --metric-only -a sleep 1
Performance counter stats for 'system wide':
Instructions CLKS CPU_Utilization GFLOPs SMT_2T_Utilization Kernel_Utilization
317614222.0 1392930775.0 0.0 0.0 0.2 0.1
1.001497549 seconds time elapsed
% perf stat -M GFLOPs flops
Performance counter stats for 'flops':
3,999,541,471 fp_comp_ops_exe.sse_scalar_single # 1.2 GFLOPs (66.65%)
14 fp_comp_ops_exe.sse_scalar_double (66.65%)
0 fp_comp_ops_exe.sse_packed_double (66.67%)
0 fp_comp_ops_exe.sse_packed_single (66.70%)
0 simd_fp_256.packed_double (66.70%)
0 simd_fp_256.packed_single (66.67%)
3.238372845 seconds time elapsed
v1: Initial post
v2: Address all review feedback (see individual patches)
BPF now works again.
Fix some bugs in perf list printing that I added last minute last time.
v3: Address all review feedback. Some patches are split. Rebased.
Not caching cpuids because it's too complicated.