Re: [RFC PATCH v8 5/7] perf stat: Add retire latency print functions to print out at the very end of print out

From: Ian Rogers
Date: Thu May 16 2024 - 14:08:20 EST


On Thu, May 16, 2024 at 10:52 AM Wang, Weilin <weilin.wang@xxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Ian Rogers <irogers@xxxxxxxxxx>
> > Sent: Thursday, May 16, 2024 9:47 AM
> > To: Wang, Weilin <weilin.wang@xxxxxxxxx>
> > Cc: Namhyung Kim <namhyung@xxxxxxxxxx>; Arnaldo Carvalho de Melo
> > <acme@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Ingo Molnar
> > <mingo@xxxxxxxxxx>; Alexander Shishkin
> > <alexander.shishkin@xxxxxxxxxxxxxxx>; Jiri Olsa <jolsa@xxxxxxxxxx>; Hunter,
> > Adrian <adrian.hunter@xxxxxxxxx>; Kan Liang <kan.liang@xxxxxxxxxxxxxxx>;
> > linux-perf-users@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Taylor, Perry
> > <perry.taylor@xxxxxxxxx>; Alt, Samantha <samantha.alt@xxxxxxxxx>; Biggers,
> > Caleb <caleb.biggers@xxxxxxxxx>
> > Subject: Re: [RFC PATCH v8 5/7] perf stat: Add retire latency print functions to
> > print out at the very end of print out
> >
> > On Tue, May 14, 2024 at 10:44 PM <weilin.wang@xxxxxxxxx> wrote:
> > >
> > > From: Weilin Wang <weilin.wang@xxxxxxxxx>
> > >
> > > Add print out functions so that users could read retire latency values.
> > >
> > > Example output:
> > >
> > > Performance counter stats for 'system wide':
> > >
> > > 25,717 MEM_INST_RETIRED.SPLIT_STORES # 2.2 % tma_split_stores
> > > 28,365,080 CPU_CLK_UNHALTED.THREAD
> > > 24.00 MEM_INST_RETIRED.SPLIT_STORES:R # 96 4
> > >
> > > 2.054365083 seconds time elapsed
> > >
> > > The retire latency data is printed in the format:
> > > <val> <event-name:R> # <sum> <count>.
> > >
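As a reading aid for the format above: in the example output, 24.00 equals 96 / 4, which is consistent with <val> being the mean latency, <sum> / <count>. The sketch below assumes that relationship; the patch text does not state it explicitly. The struct and function names are hypothetical stand-ins, not the patch's struct tpebs_retire_lat.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one retire latency result. */
struct retire_lat_sample {
	const char *name;	/* event name with the :R modifier */
	long count;		/* number of samples collected */
	long sum;		/* sum of the sampled retire latencies */
};

/* Assumed relationship: val = sum / count; 0.0 when there are no samples. */
static double retire_lat_mean(const struct retire_lat_sample *s)
{
	assert(s != NULL);
	return s->count ? (double)s->sum / (double)s->count : 0.0;
}

/* Format one result as "<val> <event-name:R> # <sum> <count>". */
static int retire_lat_format(char *buf, size_t len,
			     const struct retire_lat_sample *s)
{
	return snprintf(buf, len, "%.2f %s # %ld %ld",
			retire_lat_mean(s), s->name, s->sum, s->count);
}
```

Calling retire_lat_format() with sum = 96 and count = 4 yields the same three numbers as the example line, without the column padding that the real printer applies.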
> > > Signed-off-by: Weilin Wang <weilin.wang@xxxxxxxxx>
> > > Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>
> >
> > My usual complaint applies: I hate the stat-display spaghetti code. We
> > keep putting more spaghetti on the plate, and this change does so
> > too. In the evsel approach:
> > https://lore.kernel.org/lkml/20240428053616.1125891-1-irogers@xxxxxxxxxx/
> > retirement latency events just update the counts for the event and so
> > we don't need to special case tpebs events like this. I'd prefer we
> > went that route. My reviewed-by no longer stands.
> >
> Given the current stat-display code and the original TPEBS counting
> code, I believe the code in this commit and the previous one was the best way
> to do the metric calculation and printout.
>
> But I totally agree with you that if we can get evsel working with retire
> latency, this code is not necessary.
>
> I was thinking of plugging the retire latency value into evsel so that I could
> drop this commit and the previous one. Do you think that would be a solution?

Yep, I think that'd be good.
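
To make the idea concrete, here is a rough sketch of what "plugging the value into the event" could look like. Everything in it is hypothetical and simplified: struct demo_event is not struct evsel, and the field and function names are invented for illustration. The only point is that once the retire latency is stored as the event's count, a single generic print path covers it and no TPEBS-specific printer is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily simplified event; NOT the real struct evsel. */
struct demo_event {
	const char *name;
	bool retire_latency;	/* assumed marker for ":R" events */
	double count;		/* the one value the generic printer shows */
};

/* Publish the mean retire latency as the event's ordinary count. */
static void demo_event_set_retire_lat(struct demo_event *ev,
				      long sum, long nr_samples)
{
	assert(ev != NULL);
	ev->retire_latency = true;
	ev->count = nr_samples ? (double)sum / (double)nr_samples : 0.0;
}

/* Generic print path: it never branches on ev->retire_latency. */
static int demo_event_print(char *buf, size_t len,
			    const struct demo_event *ev)
{
	return snprintf(buf, len, "%18.2f  %s", ev->count, ev->name);
}
```

With sum = 96 and nr_samples = 4, demo_event_print() emits 24.00 next to the event name through the same code that would print any other counter.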

Thanks,
Ian

> Thanks,
> Weilin
>
> > Thanks,
> > Ian
> >
> > > ---
> > > tools/perf/util/stat-display.c | 69 ++++++++++++++++++++++++++++++++++
> > > 1 file changed, 69 insertions(+)
> > >
> > > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > > index bfc1d705f437..b9c3978cc99c 100644
> > > --- a/tools/perf/util/stat-display.c
> > > +++ b/tools/perf/util/stat-display.c
> > > @@ -21,6 +21,7 @@
> > > #include "iostat.h"
> > > #include "pmu.h"
> > > #include "pmus.h"
> > > +#include "intel-tpebs.h"
> > >
> > > #define CNTR_NOT_SUPPORTED "<not supported>"
> > > #define CNTR_NOT_COUNTED "<not counted>"
> > > @@ -34,6 +35,7 @@
> > > #define COMM_LEN 16
> > > #define PID_LEN 7
> > > #define CPUS_LEN 4
> > > +#define RETIRE_LEN 8
> > >
> > > static int aggr_header_lens[] = {
> > > [AGGR_CORE] = 18,
> > > @@ -426,6 +428,71 @@ static void print_metric_std(struct perf_stat_config *config,
> > > fprintf(out, " %-*s", METRIC_LEN - n - 1, unit);
> > > }
> > >
> > > +static void print_retire_lat_std(struct perf_stat_config *config,
> > > + struct outstate *os)
> > > +{
> > > + FILE *out = os->fh;
> > > + bool newline = os->newline;
> > > + struct tpebs_retire_lat *t;
> > > + struct list_head *retire_lats = &config->tpebs_results;
> > > +
> > > + list_for_each_entry(t, retire_lats, nd) {
> > > + if (newline)
> > > + do_new_line_std(config, os);
> > > + fprintf(out, "%'*.2f ", COUNTS_LEN, t->val);
> > > + /* For print alignment */
> > > + fprintf(out, "%-*s ", config->unit_width, "");
> > > + fprintf(out, "%-*s", EVNAME_LEN, t->tpebs_name);
> > > + fprintf(out, " # ");
> > > + fprintf(out, "%*d %*ld\n", RETIRE_LEN, t->sum,
> > > + RETIRE_LEN, t->count);
> > > + }
> > > +}
> > > +
> > > +static void print_retire_lat_csv(struct perf_stat_config *config,
> > > + struct outstate *os)
> > > +{
> > > + FILE *out = os->fh;
> > > + struct tpebs_retire_lat *t;
> > > + struct list_head *retire_lats = &config->tpebs_results;
> > > + const char *sep = config->csv_sep;
> > > +
> > > + list_for_each_entry(t, retire_lats, nd) {
> > > + fprintf(out, "%f%s%s%s%s%ld%s%d\n", t->val, sep, sep, t->tpebs_name, sep,
> > > + t->count, sep, t->sum);
> > > + }
> > > +}
> > > +
> > > +static void print_retire_lat_json(struct perf_stat_config *config,
> > > + struct outstate *os)
> > > +{
> > > + FILE *out = os->fh;
> > > + struct tpebs_retire_lat *t;
> > > + struct list_head *retire_lats = &config->tpebs_results;
> > > +
> > > + fprintf(out, "{");
> > > + list_for_each_entry(t, retire_lats, nd) {
> > > + fprintf(out, "\"retire_latency-value\" : \"%f\", ", t->val);
> > > + fprintf(out, "\"name\" : \"%s\", ", t->tpebs_name);
> > > + fprintf(out, "\"sample-counts\" : \"%ld\", ", t->count);
> > > + fprintf(out, "\"retire_latency-sum\" : \"%d\"", t->sum);
> > > + }
> > > + fprintf(out, "}");
> > > +}
> > > +
> > > +static void print_retire_lat(struct perf_stat_config *config,
> > > + struct outstate *os)
> > > +{
> > > + if (list_empty(&config->tpebs_results))
> > > + return;
> > > + if (config->json_output)
> > > + print_retire_lat_json(config, os);
> > > + else if (config->csv_output)
> > > + print_retire_lat_csv(config, os);
> > > + else
> > > + print_retire_lat_std(config, os);
> > > +}
> > > +
> > > static void new_line_csv(struct perf_stat_config *config, void *ctx)
> > > {
> > > struct outstate *os = ctx;
> > > @@ -1609,6 +1676,8 @@ void evlist__print_counters(struct evlist *evlist, struct perf_stat_config *conf
> > > break;
> > > }
> > >
> > > + print_retire_lat(config, &os);
> > > +
> > > print_footer(config);
> > >
> > > fflush(config->output);
> > > --
> > > 2.43.0
> > >
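
A note on guarding the print call: because tpebs_results is a list head embedded in the config struct, its address is never NULL, so a "nothing to print" check has to test whether the list is empty. The sketch below uses a hand-rolled stand-in for the kernel's struct list_head so that it compiles on its own; all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hand-rolled stand-in for the kernel's circular doubly linked list head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

struct demo_config {
	struct demo_list_head results;	/* embedded, like config->tpebs_results */
};

/* An empty list is a head that points at itself. */
static void demo_list_init(struct demo_list_head *head)
{
	head->next = head;
	head->prev = head;
}

static bool demo_list_empty(const struct demo_list_head *head)
{
	return head->next == head;
}

/* Link node in just before head, i.e. at the tail of the list. */
static void demo_list_add_tail(struct demo_list_head *node,
			       struct demo_list_head *head)
{
	assert(node != NULL && head != NULL);
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* True when there is at least one result to print. */
static bool demo_has_results(const struct demo_config *config)
{
	/* &config->results can never be NULL, so test emptiness instead. */
	return !demo_list_empty(&config->results);
}
```

A freshly initialised config reports no results; after one demo_list_add_tail() call it reports that there is something to print.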