Re: [RFC PATCH v8 6/7] perf vendor events intel: Add MTL metric json files

From: Ian Rogers
Date: Thu May 16 2024 - 12:57:48 EST

Next message: Sean Christopherson: "Re: [PATCH v10 24/27] KVM: x86: Enable CET virtualization for VMX and advertise to userspace"
Previous message: Florian Fainelli: "Re: [PATCH 6.1 000/244] 6.1.91-rc3 review"
Next in thread: Wang, Weilin: "RE: [RFC PATCH v8 6/7] perf vendor events intel: Add MTL metric json files"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, May 14, 2024 at 10:44 PM <weilin.wang@xxxxxxxxx> wrote:
>
> From: Weilin Wang <weilin.wang@xxxxxxxxx>
>
> Add MTL metric json file at TMA4.7 [1]. Some of the metrics' formulas use TPEBS
> retire_latency in MTL.
>
> [1] https://lore.kernel.org/all/20240214011820.644458-1-irogers@xxxxxxxxxx/
>
> Signed-off-by: Weilin Wang <weilin.wang@xxxxxxxxx>
> Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>

This change works either with the approach in this series or with the
evsel approach, so I don't mind my Reviewed-by standing. I'd prefer
that we have an evsel read counter implementation that returns 0, so
that we can run without gathering retirement latency.

TMA 4.7 is broken in that the tma_lock_latency metric uses a
retirement latency event, but not within a max function, so having the
read counter return 0 would break the metric (the product, and so the
whole metric, would always evaluate to 0):

+ {
+ "BriefDescription": "This metric represents fraction of cycles the CPU spent handling cache misses due to lock operations",
+ "MetricExpr": "MEM_INST_RETIRED.LOCK_LOADS * MEM_INST_RETIRED.LOCK_LOADS:R / tma_info_thread_clks",
+ "MetricGroup": "Offcore;TopdownL4;tma_L4_group;tma_issueRFO;tma_l1_bound_group",
+ "MetricName": "tma_lock_latency",
+ "MetricThreshold": "tma_lock_latency > 0.2 & (tma_l1_bound > 0.1 & (tma_memory_bound > 0.2 & tma_backend_bound > 0.2))",
+ "PublicDescription": "This metric represents fraction of cycles the CPU spent handling cache misses due to lock operations. Due to the microarchitecture handling of locks; they are classified as L1_Bound regardless of what memory source satisfied them. Sample with: MEM_INST_RETIRED.LOCK_LOADS_PS. Related metrics: tma_store_latency",
+ "ScaleUnit": "100%",
+ "Unit": "cpu_core"
+ },

Other metrics then use that metric, specifically
tma_info_bottleneck_memory_data_tlbs and
tma_info_bottleneck_cache_memory_bandwidth.
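
To make the concern concrete, here is a minimal Python sketch of the
arithmetic in the quoted MetricExpr. It is not perf code. The sample
counter values, the function names, and in particular the floor passed
to max() are made up for illustration; the TMA spreadsheet may guard
the term differently.

```python
def lock_latency(lock_loads, retire_latency, clks):
    """Mirrors the quoted MetricExpr: LOCK_LOADS * LOCK_LOADS:R / clks."""
    return lock_loads * retire_latency / clks


def lock_latency_guarded(lock_loads, retire_latency, clks, floor):
    """Hypothetical variant with the :R term wrapped in max().

    'floor' is an assumed fallback latency, not a value from TMA.
    """
    return lock_loads * max(retire_latency, floor) / clks


# Made-up sample values, for illustration only.
lock_loads = 1000
clks = 1_000_000
assumed_floor = 20  # hypothetical fallback latency in cycles

# The read counter returns 0 when retirement latency is not gathered.
unguarded = lock_latency(lock_loads, 0, clks)
guarded = lock_latency_guarded(lock_loads, 0, clks, assumed_floor)

print("unguarded:", unguarded)  # the product collapses to zero
print("guarded:  ", guarded)    # stays non-zero thanks to the floor
```

With a 0 retirement latency the unguarded product is 0 regardless of
how many lock loads retired, and any metric that consumes
tma_lock_latency inherits that 0.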

I couldn't find the updated MTL metrics in the TMA 4.8 release:
https://github.com/intel/perfmon/pull/181/commits/d54c847b2f863c98a917bdd31a0680f4d50ff75c
but my belief is that this issue hasn't been addressed.

Thanks,
Ian
