Re: [PATCH v5 11/24] perf vendor events: Update/add Graniterapids events/metrics

From: Ian Rogers
Date: Thu Feb 06 2025 - 11:40:48 EST


On Thu, Feb 6, 2025 at 6:32 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
> On 2025-02-05 4:33 p.m., Ian Rogers wrote:
> > On Wed, Feb 5, 2025 at 1:10 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >>
> >> On 2025-02-05 3:23 p.m., Ian Rogers wrote:
> >>> On Wed, Feb 5, 2025 at 11:11 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> On 2025-02-05 12:31 p.m., Ian Rogers wrote:
> >>>>> + {
> >>>>> + "BriefDescription": "This category represents fraction of slots utilized by useful work i.e. issued uops that eventually get retired",
> >>>>> + "MetricExpr": "topdown\\-retiring / (topdown\\-fe\\-bound + topdown\\-bad\\-spec + topdown\\-retiring + topdown\\-be\\-bound) + 0 * slots",
> >>>>> + "MetricGroup": "BvUW;TmaL1;TopdownL1;tma_L1_group",
> >>>>> + "MetricName": "tma_retiring",
> >>>>> + "MetricThreshold": "tma_retiring > 0.7 | tma_heavy_operations > 0.1",
> >>>>> + "MetricgroupNoGroup": "TopdownL1",
> >>>>> + "PublicDescription": "This category represents fraction of slots utilized by useful work i.e. issued uops that eventually get retired. Ideally; all pipeline slots would be attributed to the Retiring category. Retiring of 100% would indicate the maximum Pipeline_Width throughput was achieved. Maximizing Retiring typically increases the Instructions-per-cycle (see IPC metric). Note that a high Retiring value does not necessary mean there is no room for more performance. For example; Heavy-operations or Microcode Assists are categorized under Retiring. They often indicate suboptimal performance and can often be optimized or avoided. Sample with: UOPS_RETIRED.SLOTS",
> >>>>> + "ScaleUnit": "100%"
> >>>>> + },
> >>>>
> >>>> The "Default" tag is missing for GNR as well.
> >>>> It seems the new CPUIDs are not added in the script?
> >>>
> >>> Spotted it, we need to manually say which architectures with TopdownL1
> >>> should be in Default because it was insisted upon that pre-Icelake
> >>> CPUs with TopdownL1 not have TopdownL1 in Default. As you know, my
> >>> preference would be to always put TopdownL1 metrics into Default.
> >>>
> >>
> >> Future platforms should always have at least TopdownL1 support.
> >> Intel even adds extra fixed counters for the TopdownL1 events.
> >>
> >> Maybe the script should be changed to only mark the old pre-Icelake
> >> platforms as not having TopdownL1 in Default. For the other platforms,
> >> always add TopdownL1 as Default. That would avoid manually adding it
> >> for every new platform.
> >
> > That's fair. What about TopdownL2 that is currently only in the
> > Default set for SPR?
> >
>
> Yes, TopdownL2 is a bit tricky, since it requires many more events.
> Could you please set it just for SPR/EMR/GNR for now?
>
> I will ask around internally and make a long-term solution for the
> TopdownL2.

Thanks Kan, I've updated the script the existing way for now. Thomas
saw another issue with TSC, which is also fixed. Before sending out
v6, I'm trying to understand what happened with this report:
https://lore.kernel.org/lkml/4f42946ffdf474fbf8aeaa142c25a25ebe739b78.camel@xxxxxxxxx/
"""
There are all some errors like this,

Testing tma_cisc
Metric contains missing events
Cannot resolve IDs for tma_cisc: cpu_atom@TOPDOWN_FE_BOUND.CISC@ / (5
* cpu_atom@CPU_CLK_UNHALTED.CORE@)
"""
But checking the JSON, I wasn't able to spot a model that has the
metric but lacks these JSON events. Knowing the model would make my
life easier :-)
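
A scan along these lines is one way to hunt for such a model. It is
only a sketch: it assumes the metrics and events for a model have
already been loaded from JSON into lists of dicts, and it uses a
simplified regex that only recognizes upper-case event names
containing a dot (e.g. TOPDOWN_FE_BOUND.CISC), so it will miss events
named differently. The function name is made up.

```python
import re

# Simplifying assumption: an event is an upper-case token with at least
# one '.', e.g. CPU_CLK_UNHALTED.CORE. Lower-case names such as cpu_atom
# or tma_cisc are ignored.
EVENT_RE = re.compile(r"\b[A-Z][A-Z0-9_]*(?:\.[A-Z0-9_]+)+\b")


def metrics_with_missing_events(metrics, events):
    """Map each metric name to the events its MetricExpr uses but the model lacks."""
    known = {e["EventName"] for e in events if "EventName" in e}
    missing = {}
    for metric in metrics:
        referenced = set(EVENT_RE.findall(metric.get("MetricExpr", "")))
        absent = sorted(referenced - known)
        if absent:
            missing[metric["MetricName"]] = absent
    return missing
```

Running it once per model directory and printing the non-empty
results would point at the model behind the error.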

Thanks,
Ian