Re: [PATCH 6/6] tools/perf/json: Add metric for tlb and cache s390

From: Ian Rogers
Date: Mon Mar 13 2023 - 11:32:44 EST


On Mon, Mar 13, 2023 at 1:30 AM Thomas Richter <tmricht@xxxxxxxxxxxxx> wrote:
>
> Add metrics for tlb and cache statistics:
> - finite_cpi: Cycles per Instructions from Finite cache/memory
> - est_cpi: Estimated Instruction Complexity CPI infinite Level 1
> - scpl1m: Estimated Sourcing Cycles per Level 1 Miss
> - tlb_percent: Estimated TLB CPU percentage of Total CPU
> - tlb_miss: Estimated Cycles per TLB Miss
>
> For details about the formulas see this documentation:
> https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf
>
> Output after:
> # ./perf stat -M tlb_miss -- dd if=/dev/zero of=/dev/null bs=1M count=10K
> ... dd output removed
>
> Performance counter stats for
> 'dd if=/dev/zero of=/dev/null bs=1M count=10K':
>
>         667,726      DTLB2_MISSES          #   440.96 tlb_miss
>             198      ITLB2_WRITES
>     795,170,260      L1C_TLB2_MISSES
>           9,478      ITLB2_MISSES
>             820      DTLB2_WRITES
>   1,197,126,869      L1D_PENALTY_CYCLES
>       2,457,447      L1I_PENALTY_CYCLES
>
> 1.249342187 seconds time elapsed
>
> 0.001030000 seconds user
> 1.248105000 seconds sys
>
> #
>
> Signed-off-by: Thomas Richter <tmricht@xxxxxxxxxxxxx>
> Acked-By: Sumanth Korikkar <sumanthk@xxxxxxxxxxxxx>
> ---
> .../arch/s390/cf_z13/transaction.json | 30 +++++++++++++++++++
> .../arch/s390/cf_z14/transaction.json | 25 ++++++++++++++++
> .../arch/s390/cf_z15/transaction.json | 25 ++++++++++++++++
> .../arch/s390/cf_z16/transaction.json | 25 ++++++++++++++++
> 4 files changed, 105 insertions(+)
>
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> index 71e2c7fa734c..b941a7212a4d 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z13/transaction.json
> @@ -43,5 +43,35 @@
> "BriefDescription": "Percentage sourced from memory",
> "MetricName": "memp",
> "MetricExpr": "((L1D_ONNODE_MEM_SOURCED_WRITES + L1D_ONDRAWER_MEM_SOURCED_WRITES + L1D_OFFDRAWER_MEM_SOURCED_WRITES + L1D_ONCHIP_MEM_SOURCED_WRITES + L1I_ONNODE_MEM_SOURCED_WRITES + L1I_ONDRAWER_MEM_SOURCED_WRITES + L1I_OFFDRAWER_MEM_SOURCED_WRITES + L1I_ONCHIP_MEM_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Cycles per Instructions from Finite cache/memory",
> + "MetricName": "finite_cpi",
> + "MetricExpr": "L1C_TLB1_MISSES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
> + "MetricName": "est_cpi",
> + "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB1_MISSES / INSTRUCTIONS)"
> + },
> + {
> + "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
> + "MetricName": "scpl1m",
> + "MetricExpr": "L1C_TLB1_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
> + },
> + {
> + "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
> + "MetricName": "tlb_percent",
> + "MetricExpr": "((DTLB1_MISSES + ITLB1_MISSES) / CPU_CYCLES) * (L1C_TLB1_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"

Looks good again, though perhaps with the ScaleUnit change. If you'd
prefer to keep it as-is for consistency, I'm happy to add my Acked-by.

Thanks,
Ian

> + },
> + {
> + "BriefDescription": "Estimated Cycles per TLB Miss",
> + "MetricName": "tlb_miss",
> + "MetricExpr": "((DTLB1_MISSES + ITLB1_MISSES) / (DTLB1_WRITES + ITLB1_WRITES)) * (L1C_TLB1_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
> + },
> + {
> + "BriefDescription": "Page Table Entry misses",
> + "MetricName": "pte_miss",
> + "MetricExpr": "(TLB2_PTE_WRITES / (DTLB1_WRITES + ITLB1_WRITES)) * 100"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> index cca237bdb7ba..ce814ea93396 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z14/transaction.json
> @@ -43,5 +43,30 @@
> "BriefDescription": "Percentage sourced from memory",
> "MetricName": "memp",
> "MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Cycles per Instructions from Finite cache/memory",
> + "MetricName": "finite_cpi",
> + "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
> + "MetricName": "est_cpi",
> + "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
> + },
> + {
> + "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
> + "MetricName": "scpl1m",
> + "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
> + },
> + {
> + "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
> + "MetricName": "tlb_percent",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
> + },
> + {
> + "BriefDescription": "Estimated Cycles per TLB Miss",
> + "MetricName": "tlb_miss",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> index cca237bdb7ba..ce814ea93396 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z15/transaction.json
> @@ -43,5 +43,30 @@
> "BriefDescription": "Percentage sourced from memory",
> "MetricName": "memp",
> "MetricExpr": "((L1D_ONCHIP_MEMORY_SOURCED_WRITES + L1D_ONCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1D_OFFDRAWER_MEMORY_SOURCED_WRITES + L1I_ONCHIP_MEMORY_SOURCED_WRITES + L1I_ONCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFCLUSTER_MEMORY_SOURCED_WRITES + L1I_OFFDRAWER_MEMORY_SOURCED_WRITES) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Cycles per Instructions from Finite cache/memory",
> + "MetricName": "finite_cpi",
> + "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
> + "MetricName": "est_cpi",
> + "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
> + },
> + {
> + "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
> + "MetricName": "scpl1m",
> + "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
> + },
> + {
> + "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
> + "MetricName": "tlb_percent",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
> + },
> + {
> + "BriefDescription": "Estimated Cycles per TLB Miss",
> + "MetricName": "tlb_miss",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
> }
> ]
> diff --git a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> index dde0735a7d22..ec2ff78e2b5f 100644
> --- a/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> +++ b/tools/perf/pmu-events/arch/s390/cf_z16/transaction.json
> @@ -43,5 +43,30 @@
> "BriefDescription": "Percentage sourced from memory",
> "MetricName": "memp",
> "MetricExpr": "((DCW_ON_CHIP_MEMORY + DCW_ON_MODULE_MEMORY + DCW_ON_DRAWER_MEMORY + DCW_OFF_DRAWER_MEMORY + ICW_ON_CHIP_MEMORY + ICW_ON_MODULE_MEMORY + ICW_ON_DRAWER_MEMORY + ICW_OFF_DRAWER_MEMORY) / (L1I_DIR_WRITES + L1D_DIR_WRITES)) * 100"
> + },
> + {
> + "BriefDescription": "Cycles per Instructions from Finite cache/memory",
> + "MetricName": "finite_cpi",
> + "MetricExpr": "L1C_TLB2_MISSES / INSTRUCTIONS"
> + },
> + {
> + "BriefDescription": "Estimated Instruction Complexity CPI infinite Level 1",
> + "MetricName": "est_cpi",
> + "MetricExpr": "(CPU_CYCLES / INSTRUCTIONS) - (L1C_TLB2_MISSES / INSTRUCTIONS)"
> + },
> + {
> + "BriefDescription": "Estimated Sourcing Cycles per Level 1 Miss",
> + "MetricName": "scpl1m",
> + "MetricExpr": "L1C_TLB2_MISSES / (L1I_DIR_WRITES + L1D_DIR_WRITES)"
> + },
> + {
> + "BriefDescription": "Estimated TLB CPU percentage of Total CPU",
> + "MetricName": "tlb_percent",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / CPU_CYCLES) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES)) * 100"
> + },
> + {
> + "BriefDescription": "Estimated Cycles per TLB Miss",
> + "MetricName": "tlb_miss",
> + "MetricExpr": "((DTLB2_MISSES + ITLB2_MISSES) / (DTLB2_WRITES + ITLB2_WRITES)) * (L1C_TLB2_MISSES / (L1I_PENALTY_CYCLES + L1D_PENALTY_CYCLES))"
> }
> ]
> --
> 2.39.1
>
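
For anyone checking the numbers: the tlb_miss value in the quoted sample
output can be reproduced from the seven counters listed there. Below is a
minimal sketch in Python; it is not part of the patch. It assumes the
TLB2-based tlb_miss expression used for cf_z14 through cf_z16 and copies the
counter values from the quoted perf stat run. The names SAMPLE and tlb_miss
are made up for illustration.

```python
# Illustrative only: recompute tlb_miss from the counters shown in the
# quoted perf stat output. SAMPLE and tlb_miss() are hypothetical names,
# not anything provided by perf.

SAMPLE = {
    "DTLB2_MISSES": 667_726,
    "ITLB2_MISSES": 9_478,
    "DTLB2_WRITES": 820,
    "ITLB2_WRITES": 198,
    "L1C_TLB2_MISSES": 795_170_260,
    "L1D_PENALTY_CYCLES": 1_197_126_869,
    "L1I_PENALTY_CYCLES": 2_457_447,
}


def tlb_miss(c):
    """Estimated cycles per TLB miss, mirroring the TLB2-based MetricExpr."""
    misses = c["DTLB2_MISSES"] + c["ITLB2_MISSES"]
    writes = c["DTLB2_WRITES"] + c["ITLB2_WRITES"]
    penalty = c["L1I_PENALTY_CYCLES"] + c["L1D_PENALTY_CYCLES"]
    # (TLB misses per TLB write) * (TLB miss cycles per L1 penalty cycle)
    return (misses / writes) * (c["L1C_TLB2_MISSES"] / penalty)


if __name__ == "__main__":
    print(f"{tlb_miss(SAMPLE):.2f} tlb_miss")
```

With the sample counters this evaluates to roughly 440.96, in line with the
value perf stat prints next to DTLB2_MISSES.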