Re: [PATCH v2 00/23] Intel vendor events and TMA 5.01 metrics
From: Ian Rogers
Date: Fri Jan 17 2025 - 14:43:21 EST
On Fri, Jan 17, 2025 at 11:10 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2025-01-17 11:03 a.m., Liang, Kan wrote:
> >
> >
> > On 2025-01-16 1:43 a.m., Ian Rogers wrote:
> >> Update the Intel vendor events to the latest.
> >> Update the metrics to TMA 5.01.
> >> Add Arrowlake and Clearwaterforest support.
> >> Add metrics for LNL and GNR.
> >> Address IIO uncore issue spotted on EMR, GRR, GNR, SPR and SRF.
> >>
> >> The perf json was generated using the script:
> >> https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
> >> with the generated json being in:
> >> https://github.com/intel/perfmon/tree/main/scripts/perf
> >>
> >> Thanks to Perry Taylor <perry.taylor@xxxxxxxxx>, Caleb Biggers
> >> <caleb.biggers@xxxxxxxxx>, Edward Baker <edward.baker@xxxxxxxxx> and
> >> Weilin Wang <weilin.wang@xxxxxxxxx> for helping get this patch series
> >> together.
> >>
> >> v2: Fix hybrid and Co-authored-by tag issues reported by
> >> Arnaldo. Updates to Lunarlake and Meteorlake events. Addition of
> >> Clearwaterforest.
> >
> > Thanks Ian!
> >
> > Acked-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> >
>
> Thanks to Thomas for doing more tests on the series.
>
> There is an issue for the FP_ARITH* related metrics on hybrid platforms.
> I have to take the acked-by back. Sorry for the noise.
>
> Here is the issue on ADL and ARL.
>
> $ sudo ./perf stat -M tma_info_inst_mix_iparith -a sleep 1
> Cannot resolve IDs for tma_info_inst_mix_iparith: INST_RETIRED.ANY /
> (FP_ARITH_INST_RETIRED.SCALAR + FP_ARITH_INST_RETIRED.VECTOR)
>
>
> The patch set adds tma_info_inst_mix_iparith for cpu_atom.
>
> +    {
> +        "BriefDescription": "Instructions per FP Arithmetic instruction (lower number means higher occurrence rate)",
> +        "MetricExpr": "INST_RETIRED.ANY / (FP_ARITH_INST_RETIRED.SCALAR + FP_ARITH_INST_RETIRED.VECTOR)",
> +        "MetricGroup": "Flops;InsType;Inst_Metric",
> +        "MetricName": "tma_info_inst_mix_iparith",
> +        "MetricThreshold": "tma_info_inst_mix_iparith < 10",
> +        "PublicDescription": "Instructions per FP Arithmetic instruction (lower number means higher occurrence rate). Values < 1 are possible due to intentional FMA double counting. Approximated prior to BDW",
> +        "Unit": "cpu_atom"
> +    },
>
> However, the FP_ARITH_INST_RETIRED.SCALAR and
> FP_ARITH_INST_RETIRED.VECTOR events are only available for cpu_core.
>
> $ sudo ./perf stat -e FP_ARITH_INST_RETIRED.SCALAR -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 0 cpu_core/FP_ARITH_INST_RETIRED.SCALAR/
>
> There should be no such metric for cpu_atom.
Thanks Thomas and Kan!
The metric came from here:
https://github.com/intel/perfmon/blob/main/ADL/metrics/perf/alderlake_metrics_goldencove_core_perf.json#L1243
The events exist in the cpu_core event json, which explains why the
metric passed the sanity check that every event is in the event json.
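A stricter check would look up each event against the PMU named in the
metric's "Unit" rather than against the union of all event json. What
follows is only a rough sketch of that idea, not what
create_perf_json.py does today: the EVENT_PMUS table, the regex and
the missing_events() helper are hypothetical names made up for
illustration, and the PMU sets are hand-written from the report above
rather than read from the real event json.

```python
import re

# Hypothetical table: event name -> PMUs whose event json defines it.
# Hand-written for illustration, not parsed from the real json files.
EVENT_PMUS = {
    "INST_RETIRED.ANY": {"cpu_core", "cpu_atom"},
    "FP_ARITH_INST_RETIRED.SCALAR": {"cpu_core"},
    "FP_ARITH_INST_RETIRED.VECTOR": {"cpu_core"},
}

# Assumption: event names are upper-case tokens with optional dotted
# suffixes. A real parser would also need to skip constants and
# references to other metrics.
EVENT_RE = re.compile(r"[A-Z][A-Z0-9_]*(?:\.[A-Z0-9_]+)*")


def missing_events(metric, event_pmus):
    """Return the events in MetricExpr that the metric's Unit PMU lacks."""
    unit = metric["Unit"]
    names = EVENT_RE.findall(metric["MetricExpr"])
    return sorted({n for n in names if unit not in event_pmus.get(n, set())})


metric = {
    "MetricName": "tma_info_inst_mix_iparith",
    "MetricExpr": "INST_RETIRED.ANY / (FP_ARITH_INST_RETIRED.SCALAR"
                  " + FP_ARITH_INST_RETIRED.VECTOR)",
    "Unit": "cpu_atom",
}

for name in missing_events(metric, EVENT_PMUS):
    print(f"{metric['MetricName']}: {name} not available on {metric['Unit']}")
```

With the table above, the cpu_atom copy of the metric is flagged for
both FP_ARITH events, while the same expression with "Unit" set to
cpu_core passes.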
I believe Caleb can address the issue. I think all the events need PMU
prefixes, as happened previously here:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py#L1535
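To make the prefixing idea concrete, here is a minimal sketch. It
assumes the pmu@EVENT@ form that, as far as I know, perf metric
expressions use to tie an event to a PMU. add_pmu_prefix() and its
regex are hypothetical and are not the code at the link above, and
which PMU each metric should get is left to the caller.

```python
import re

# Assumption: event names are upper-case tokens with optional dotted
# suffixes. Anything else in the expression (operators, numbers,
# lower-case metric references) is left untouched.
EVENT_RE = re.compile(r"[A-Z][A-Z0-9_]*(?:\.[A-Z0-9_]+)*")


def add_pmu_prefix(expr, pmu):
    """Rewrite every event name in expr as pmu@EVENT@."""
    return EVENT_RE.sub(lambda m: f"{pmu}@{m.group(0)}@", expr)


expr = ("INST_RETIRED.ANY / (FP_ARITH_INST_RETIRED.SCALAR"
        " + FP_ARITH_INST_RETIRED.VECTOR)")
print(add_pmu_prefix(expr, "cpu_core"))
```

Prefixing alone does not make a cpu_core-only event available on
cpu_atom, so this would still need a per-PMU existence check before a
metric is emitted for a given "Unit".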
There are a few more kinks to resolve in the new TMA release process.
Thanks for the testing!
Ian