RE: [RFC PATCH v6 4/5] perf stat: Add retire latency print functions to print out at the very end of print out

From: Wang, Weilin
Date: Mon Apr 01 2024 - 17:41:40 EST

> -----Original Message-----
> From: Namhyung Kim <namhyung@xxxxxxxxxx>
> Sent: Monday, April 1, 2024 2:15 PM
> To: Wang, Weilin <weilin.wang@xxxxxxxxx>
> Cc: Ian Rogers <irogers@xxxxxxxxxx>; Arnaldo Carvalho de Melo
> <acme@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Ingo Molnar
> <mingo@xxxxxxxxxx>; Alexander Shishkin
> <alexander.shishkin@xxxxxxxxxxxxxxx>; Jiri Olsa <jolsa@xxxxxxxxxx>; Hunter,
> Adrian <adrian.hunter@xxxxxxxxx>; Kan Liang <kan.liang@xxxxxxxxxxxxxxx>;
> linux-perf-users@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Taylor, Perry
> <perry.taylor@xxxxxxxxx>; Alt, Samantha <samantha.alt@xxxxxxxxx>; Biggers,
> Caleb <caleb.biggers@xxxxxxxxx>
> Subject: Re: [RFC PATCH v6 4/5] perf stat: Add retire latency print functions to
> print out at the very end of print out
>
> On Mon, Apr 1, 2024 at 2:08 PM Wang, Weilin <weilin.wang@xxxxxxxxx>
> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Namhyung Kim <namhyung@xxxxxxxxxx>
> > > Sent: Monday, April 1, 2024 2:04 PM
> > > To: Wang, Weilin <weilin.wang@xxxxxxxxx>
> > > Cc: Ian Rogers <irogers@xxxxxxxxxx>; Arnaldo Carvalho de Melo
> > > <acme@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Ingo Molnar
> > > <mingo@xxxxxxxxxx>; Alexander Shishkin
> > > <alexander.shishkin@xxxxxxxxxxxxxxx>; Jiri Olsa <jolsa@xxxxxxxxxx>; Hunter,
> > > Adrian <adrian.hunter@xxxxxxxxx>; Kan Liang <kan.liang@xxxxxxxxxxxxxxx>;
> > > linux-perf-users@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Taylor,
> > > Perry <perry.taylor@xxxxxxxxx>; Alt, Samantha <samantha.alt@xxxxxxxxx>;
> > > Biggers, Caleb <caleb.biggers@xxxxxxxxx>
> > > Subject: Re: [RFC PATCH v6 4/5] perf stat: Add retire latency print
> > > functions to print out at the very end of print out
> > >
> > > On Fri, Mar 29, 2024 at 12:12 PM <weilin.wang@xxxxxxxxx> wrote:
> > > >
> > > > From: Weilin Wang <weilin.wang@xxxxxxxxx>
> > > >
> > > > Add print out functions so that users could read retire latency values.
> > > >
> > > > Example output:
> > > > In this simple example, there is no MEM_INST_RETIRED.STLB_HIT_STORES
> > > > sample. Therefore, the MEM_INST_RETIRED.STLB_HIT_STORES:p retire_latency
> > > > value, count and sum are all 0.
> > > >
> > > > Performance counter stats for 'system wide':
> > > >
> > > > 181,047,168 cpu_core/TOPDOWN.SLOTS/ # 0.6 % tma_dtlb_store
> > > > 3,195,608 cpu_core/topdown-retiring/
> > > > 40,156,649 cpu_core/topdown-mem-bound/
> > > > 3,550,925 cpu_core/topdown-bad-spec/
> > > > 117,571,818 cpu_core/topdown-fe-bound/
> > > > 57,118,087 cpu_core/topdown-be-bound/
> > > > 69,179 cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> > > > 4,582 cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> > > > 30,183,104 cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> > > > 30,556,790 cpu_core/CPU_CLK_UNHALTED.THREAD/
> > > > 168,486 cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> > > > 0.00 MEM_INST_RETIRED.STLB_HIT_STORES:p 0 0
> > >
> > > The output is not aligned and I think it's hard to read.
> > > I think it should print the result like this:
> > >
> > > <sum> <event-name> # <val> average retired latency
> >
> > Since we would like to use the average retire latency, I would think put
> > average at the beginning would be more consistent. So in format like:
> > <val> <event-name> <sum> <count> or <val> <event-name> <count> <sum> ?
>
> But it's not consistent with others. When I see the perf stat
> output, I'd expect it shows the total count. And the average
> latency is a derived value so I think it can be treated as a metric.

I think whether it is consistent or not depends on how we read this data.
If multiplexing is happening, would the total count shown for each event be
the scaled count or the raw count? If these are scaled counts, then they are
derived values as well. Either way, we do expect the first column to show the
value we care about most in each row.
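
To make the scaled-versus-raw point concrete, here is a minimal sketch,
assuming the usual extrapolation of raw * time_enabled / time_running for a
multiplexed event, and assuming the average retire latency is simply
sum / count. The helper names scaled_count() and avg_retire_latency() are
hypothetical and are not functions from the perf tree; this only illustrates
why both columns could be viewed as derived values.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the perf tree): extrapolate a raw count
 * to the full enabled time when the event was multiplexed.
 */
uint64_t scaled_count(uint64_t raw, uint64_t enabled, uint64_t running)
{
	if (running == 0)
		return 0;	/* event was never scheduled: nothing to scale */
	if (running >= enabled)
		return raw;	/* no multiplexing: the raw count is exact */
	return (uint64_t)((double)raw * enabled / running);
}

/*
 * Hypothetical helper: average retire latency derived from the sampled
 * sum and sample count. With no samples it reports 0.
 */
double avg_retire_latency(uint64_t sum, uint64_t count)
{
	return count ? (double)sum / count : 0.0;
}
```

With these assumptions, an event that ran for half of its enabled time has its
raw count doubled, so the number in the first column is already computed rather
than measured directly, much like the average latency.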

Thanks,
Weilin

>
> Thanks,
> Namhyung