Re: [RFC PATCH v12 6/8] perf stat: Add command line option for enabling tpebs recording

From: Namhyung Kim
Date: Tue Jun 11 2024 - 14:49:38 EST


On Tue, Jun 11, 2024 at 10:49 AM Wang, Weilin <weilin.wang@xxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Namhyung Kim <namhyung@xxxxxxxxxx>
> > Sent: Tuesday, June 11, 2024 9:50 AM
> > To: Wang, Weilin <weilin.wang@xxxxxxxxx>
> > Cc: Ian Rogers <irogers@xxxxxxxxxx>; Arnaldo Carvalho de Melo
> > <acme@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Ingo Molnar
> > <mingo@xxxxxxxxxx>; Alexander Shishkin
> > <alexander.shishkin@xxxxxxxxxxxxxxx>; Jiri Olsa <jolsa@xxxxxxxxxx>; Hunter,
> > Adrian <adrian.hunter@xxxxxxxxx>; Kan Liang <kan.liang@xxxxxxxxxxxxxxx>;
> > linux-perf-users@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Taylor, Perry
> > <perry.taylor@xxxxxxxxx>; Alt, Samantha <samantha.alt@xxxxxxxxx>; Biggers,
> > Caleb <caleb.biggers@xxxxxxxxx>
> > Subject: Re: [RFC PATCH v12 6/8] perf stat: Add command line option for
> > enabling tpebs recording
> >
> > Hello,
> >
> > On Mon, Jun 10, 2024 at 10:24 PM <weilin.wang@xxxxxxxxx> wrote:
> > >
> > > From: Weilin Wang <weilin.wang@xxxxxxxxx>
> > >
> > > With this command line option, tpebs recording is turned off in perf stat on
> > > default. It will only be turned on when this option is given in perf stat
> > > command.
> > >
> > > Exampe with --enable-tpebs-recording:
> > >
> > > perf stat -M tma_dtlb_store -a --enable-tpebs-recording sleep 1
> > >
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 0.000 MB - ]
> > >
> > > Performance counter stats for 'system wide':
> > >
> > > 181,047,168 cpu_core/TOPDOWN.SLOTS/ # 0.6 %
> > tma_dtlb_store
> > > 3,195,608 cpu_core/topdown-retiring/
> > > 40,156,649 cpu_core/topdown-mem-bound/
> > > 3,550,925 cpu_core/topdown-bad-spec/
> > > 117,571,818 cpu_core/topdown-fe-bound/
> > > 57,118,087 cpu_core/topdown-be-bound/
> > > 69,179 cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> > > 4,582 cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> > > 30,183,104 cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> > > 30,556,790 cpu_core/CPU_CLK_UNHALTED.THREAD/
> > > 168,486 cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> > > 0 MEM_INST_RETIRED.STLB_HIT_STORES:R
> >
> > Here I guess we can expect a non-zero value, right?
>
> Hi Namhyung,
>
> Do you mean when we have the option, we should expect non-zero value?
> During the execution, it's possible that we don't capture any sample on this
> event. In this case, we would have a zero value. In the future, I think we will
> give it the default value instead of zero.

I mean there's a chance you get non-zero, but it can be zero sometimes, right?
Then I think it's better to choose non-zero in this example otherwise it's not
clear what's the difference of using this option or not when you look at the
output in this commit message (they are both 0).

Thanks,
Namhyung

>
> >
> > >
> > > 1.003105924 seconds time elapsed
> > >
> > > Exampe without --enable-tpebs-recording:
> > >
> > > perf stat -M tma_dtlb_store -a sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > > 181,047,168 cpu_core/TOPDOWN.SLOTS/ # 0.6 %
> > tma_dtlb_store
> > > 3,195,608 cpu_core/topdown-retiring/
> > > 40,156,649 cpu_core/topdown-mem-bound/
> > > 3,550,925 cpu_core/topdown-bad-spec/
> > > 117,571,818 cpu_core/topdown-fe-bound/
> > > 57,118,087 cpu_core/topdown-be-bound/
> > > 69,179 cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> > > 4,582 cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> > > 30,183,104 cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> > > 30,556,790 cpu_core/CPU_CLK_UNHALTED.THREAD/
> > > 168,486 cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> > > 0 MEM_INST_RETIRED.STLB_HIT_STORES:R
> > >
> > > 1.003105924 seconds time elapsed
> > >
> > > Signed-off-by: Weilin Wang <weilin.wang@xxxxxxxxx>
> > > ---
> > > tools/perf/builtin-stat.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > > index 14be132f7b6f..055139e44763 100644
> > > --- a/tools/perf/builtin-stat.c
> > > +++ b/tools/perf/builtin-stat.c
> > > @@ -1235,6 +1235,8 @@ static struct option stat_options[] = {
> >
> > What's the base of your patchset? I think we moved this to cmd_stat().
> > Please make sure to rebase on the perf-tools-next.
>
> This was rebased on top of Ian's change for the tool event and retire_latency parser
> patches. I think that was on tmp.perf-tools.next. I will rebase on perf-tools-next.
>
> >
> >
> > > "disable adding events for the metric threshold calculation"),
> > > OPT_BOOLEAN(0, "topdown", &topdown_run,
> > > "measure top-down statistics"),
> > > + OPT_BOOLEAN(0, "enable-tpebs-recording", &tpebs_recording,
> > > + "enable recording for tpebs when retire_latency required"),
> >
> > Please update the documentation (man page) for this option.
> >
> > Also I'm afraid it'd fail to build on non-x86 because tpebes_recording
> > is defined in intel-tpebs.c.
> >
>
> Yes, you are right. I forgot about non-x86. I will update this and also the
> Documentation.
>
> Thanks,
> Weilin
>
> > Thanks,
> > Namhyung
> >
> >
> > > OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
> > > "Set the metrics level for the top-down statistics (0: max level)"),
> > > OPT_BOOLEAN(0, "smi-cost", &smi_cost,
> > > --
> > > 2.43.0
> > >