[tip:perf/core] perf vendor events: Add JSON metrics for Skylake

From: tip-bot for Andi Kleen
Date: Fri Sep 22 2017 - 12:40:18 EST


Commit-ID: 2e006a24127ad88422632f9e5d6a8039a40f01da
Gitweb: http://git.kernel.org/tip/2e006a24127ad88422632f9e5d6a8039a40f01da
Author: Andi Kleen <ak@xxxxxxxxxxxxxxx>
AuthorDate: Sun, 23 Jul 2017 21:54:49 -0700
Committer: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
CommitDate: Wed, 13 Sep 2017 09:49:17 -0300

perf vendor events: Add JSON metrics for Skylake

Add JSON metrics for Skylake.

Committer testing:

# grep "model name" /proc/cpuinfo | head -1
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
# uname -a
Linux seventh 4.12.0-rc6+ #1 SMP Fri Jun 30 16:40:55 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
# perf stat --metric-only -M Summary -a sleep 1

Performance counter stats for 'system wide':

Instructions CPI CLKS CPU_Utilization GFLOPs SMT_2T_Utilization Kernel_Utilization
34021097.0 0.0 119424171.0 0.0 0.0 0.0 0.0

1.001001793 seconds time elapsed

# perf list metricgroup

List of pre-defined events (to be used in -e):

Metric Groups:

DSB
FLOPS
Frontend
Frontend_Bandwidth
Memory_BW
Memory_Bound
Memory_Lat
Pipeline
Ports_Utilization
Power
SMT
Summary
TLB
TopDownL1
Unknown_Branches
# perf stat --metric-only -M Ports_Utilization -a sleep 1

Performance counter stats for 'system wide':

ILP
1475828.0

1.000688547 seconds time elapsed

# perf stat -v --metric-only -M Ports_Utilization -a sleep 1
Using CPUID GenuineIntel-6-9E
metric expr uops_executed.thread / ( uops_executed.core_cycles_ge_1 / 2) if #smt_on else uops_executed.core_cycles_ge_1 for ILP
found event uops_executed.thread
found event uops_executed.core_cycles_ge_1
adding {uops_executed.thread,uops_executed.core_cycles_ge_1}:W
uops_executed.thread -> cpu/umask=0x1,period=2000003,event=0xb1/
uops_executed.core_cycles_ge_1 -> cpu/umask=0x2,period=2000003,cmask=1,event=0xb1/
uops_executed.thread: 8115271 4002547654 4002547654
uops_executed.core_cycles_ge_1: 3282969 4002547654 4002547654

Performance counter stats for 'system wide':

ILP
3282969.0

1.000719870 seconds time elapsed

#

Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Tested-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Link: http://lkml.kernel.org/r/20170905195235.GW2482@xxxxxxxxxxxxxxxxxx
Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
---
.../{broadwell/bdw-metrics.json => skylake/skl-metrics.json} | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json b/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
similarity index 85%
copy from tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
copy to tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
index 49c5f12..411f941 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
@@ -13,7 +13,7 @@
},
{
"BriefDescription": "Rough Estimation of fraction of fetched lines bytes that were likely consumed by program instructions",
- "MetricExpr": "min( 1 , IDQ.MITE_UOPS / ( UOPS_RETIRED.RETIRE_SLOTS / INST_RETIRED.ANY * 16 * ( ICACHE.HIT + ICACHE.MISSES ) / 4.0 ) )",
+ "MetricExpr": "min( 1 , UOPS_ISSUED.ANY / (UOPS_RETIRED.RETIRE_SLOTS / INST_RETIRED.ANY * 64 * ( ICACHE_64B.IFTAG_HIT + ICACHE_64B.IFTAG_MISS ) / 4.1) )",
"MetricGroup": "Frontend",
"MetricName": "IFetch_Line_Utilization"
},
@@ -55,13 +55,13 @@
},
{
"BriefDescription": "Instruction-Level-Parallelism (average number of uops executed when there is at least 1 uop executed)",
- "MetricExpr": "UOPS_EXECUTED.THREAD / ( cpu@xxxxxxxxxxxxxxxxxx\\,cmask\\=1@ / 2) if #SMT_on else UOPS_EXECUTED.CYCLES_GE_1_UOP_EXEC",
+ "MetricExpr": "UOPS_EXECUTED.THREAD / ( UOPS_EXECUTED.CORE_CYCLES_GE_1 / 2) if #SMT_on else UOPS_EXECUTED.CORE_CYCLES_GE_1",
"MetricGroup": "Pipeline;Ports_Utilization",
"MetricName": "ILP"
},
{
"BriefDescription": "Average Branch Address Clear Cost (fraction of cycles)",
- "MetricExpr": "2* ( RS_EVENTS.EMPTY_CYCLES - ICACHE.IFDATA_STALL - ( 14 * ITLB_MISSES.STLB_HIT + cpu@xxxxxxxxxxxxxxxxxxxxxxxxx\\,cmask\\=1@ + 7* ITLB_MISSES.WALK_COMPLETED ) ) / RS_EVENTS.EMPTY_END",
+ "MetricExpr": "2* ( RS_EVENTS.EMPTY_CYCLES - ICACHE_16B.IFDATA_STALL - ICACHE_64B.IFTAG_STALL ) / RS_EVENTS.EMPTY_END",
"MetricGroup": "Unknown_Branches",
"MetricName": "BAClear_Cost"
},
@@ -73,19 +73,19 @@
},
{
"BriefDescription": "Actual Average Latency for L1 data-cache miss demand loads",
- "MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_UOPS_RETIRED.L1_MISS + mem_load_uops_retired.hit_lfb )",
+ "MetricExpr": "L1D_PEND_MISS.PENDING / ( MEM_LOAD_RETIRED.L1_MISS + MEM_LOAD_RETIRED.FB_HIT )",
"MetricGroup": "Memory_Bound;Memory_Lat",
"MetricName": "Load_Miss_Real_Latency"
},
{
"BriefDescription": "Memory-Level-Parallelism (average number of L1 miss demand load when there is at least 1 such miss)",
- "MetricExpr": "L1D_PEND_MISS.PENDING / ( cpu@xxxxxxxxxxxxxxxxxxxxxxxxxxxx\\,any\\=1@ / 2) if #SMT_on else L1D_PEND_MISS.PENDING_CYCLES",
+ "MetricExpr": "L1D_PEND_MISS.PENDING / ( L1D_PEND_MISS.PENDING_CYCLES_ANY / 2) if #SMT_on else L1D_PEND_MISS.PENDING_CYCLES",
"MetricGroup": "Memory_Bound;Memory_BW",
"MetricName": "MLP"
},
{
"BriefDescription": "Utilization of the core's Page Walker(s) serving STLB misses triggered by instruction/Load/Store accesses",
- "MetricExpr": "( cpu@xxxxxxxxxxxxxxxxxxxxxxxxx\\,cmask\\=1@ + cpu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\,cmask\\=1@ + cpu@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\\,cmask\\=1@ + 7*(DTLB_STORE_MISSES.WALK_COMPLETED+DTLB_LOAD_MISSES.WALK_COMPLETED+ITLB_MISSES.WALK_COMPLETED)) / ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles",
+ "MetricExpr": "( ITLB_MISSES.WALK_PENDING + DTLB_LOAD_MISSES.WALK_PENDING + DTLB_STORE_MISSES.WALK_PENDING + EPT.WALK_PENDING ) / ( 2 * ( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles )",
"MetricGroup": "TLB",
"MetricName": "Page_Walks_Utilization"
},