RE: [RFC PATCH v18 3/8] perf stat: Fork and launch perf record when perf stat needs to get retire latency value for a metric.
From: Wang, Weilin
Date: Mon Aug 05 2024 - 16:34:54 EST
> -----Original Message-----
> From: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> Sent: Monday, August 5, 2024 1:20 PM
> To: Wang, Weilin <weilin.wang@xxxxxxxxx>
> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>; Ian Rogers
> <irogers@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Ingo Molnar
> <mingo@xxxxxxxxxx>; Alexander Shishkin
> <alexander.shishkin@xxxxxxxxxxxxxxx>; Jiri Olsa <jolsa@xxxxxxxxxx>; Hunter,
> Adrian <adrian.hunter@xxxxxxxxx>; Kan Liang <kan.liang@xxxxxxxxxxxxxxx>;
> linux-perf-users@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Taylor, Perry
> <perry.taylor@xxxxxxxxx>; Alt, Samantha <samantha.alt@xxxxxxxxx>; Biggers,
> Caleb <caleb.biggers@xxxxxxxxx>
> Subject: Re: [RFC PATCH v18 3/8] perf stat: Fork and launch perf record when
> perf stat needs to get retire latency value for a metric.
>
> On Mon, Aug 05, 2024 at 04:43:06PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Aug 05, 2024 at 04:40:37PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Sat, Jul 20, 2024 at 02:20:56AM -0400, weilin.wang@xxxxxxxxx wrote:
> > > > From: Weilin Wang <weilin.wang@xxxxxxxxx>
> > > >
> > > > When a retire_latency value is used in a metric formula, evsel forks a perf
> > > > record process with the "-e" and "-W" options. Perf record collects the required
> > > > retire_latency values in parallel while perf stat is collecting counting values.
> > > >
> > > > When perf stat stops counting, evsel stops perf record by sending a SIGTERM
> > > > signal to the perf record process. The sampled data is then processed
> > > > to get the retire latency value. Another thread is required to synchronize
> > > > between perf stat and perf record when data is passed through a pipe.
> > > >
> > > > The retire_latency evsel is not opened for perf stat, so no counter is
> > > > wasted on it. This commit includes code suggested by Namhyung to adjust the reading
> > > > size for groups that include retire_latency evsels.
> > >
> > > Failing at this point:
> > >
> > > ⬢[acme@toolbox perf-tools-next]$ git log --oneline -5
> > > 13430131acc4f88b (HEAD) perf stat: Fork and launch perf record when perf stat needs to get retire latency value for a metric.
> > > b7b9adefb5d57aaf perf data: Allow to use given fd in data->file.fd
> > > 3a442bf266d1f3c7 perf parse-events: Add a retirement latency modifier
> > > ce533c9bc6deb125 (perf-tools-next.korg/tmp.perf-tools-next, acme.korg/tmp.perf-tools-next) perf annotate: Add --skip-empty option
> > > bb588e38290fb723 perf annotate: Set al->data_nr using the notes->src->nr_events
> > > ⬢[acme@toolbox perf-tools-next]$
> > >
> > > I'll see whether applying a follow-up patch solves this; if so, I
> > > will try to fix things up or ask for help, since this seems to be
> > > breaking 'git bisect' for this codebase.
> >
> > Indeed, when the next patch gets applied it builds without problems.
> > I.e. patch 4/8 fixes problems in patch 3/8, maybe just combine them
> > into one single patch?
>
> I have everything in the tmp.perf-tools-next branch at:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git
>
> I'll check later or tomorrow if I can fix up the bisection breakage
> described above or if just squashing together 3/8 with 4/8 is better,
> please advise.
>
Squashing 3/8 and 4/8 looks good to me. We would then have all the major functions
in one patch.
Thanks,
Weilin
> Then I'll move it to the perf-tools-next branch.
>
> - Arnaldo
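
For readers following the thread, the flow described in the quoted commit
message (fork a helper process, let it run alongside the parent, then stop it
with SIGTERM and read what it wrote to a pipe) can be sketched roughly as
below. This is only an illustration under stated assumptions, not the code
from the patch: spawn_with_pipe() and stop_child() are hypothetical names, and
the command to run is whatever argv the caller passes, not the actual perf
record invocation. The sketch also leaves out the extra reader thread the
commit message mentions; a real implementation would presumably need to drain
the pipe while the child is running so the child does not block on a full
pipe.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that runs argv with its stdout
 * redirected into a pipe. Returns the child's pid (or -1 on error) and
 * stores the read end of the pipe in *read_fd.
 */
static pid_t spawn_with_pipe(char *const argv[], int *read_fd)
{
	int fds[2];
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: route stdout into the pipe, then exec the command. */
		close(fds[0]);
		dup2(fds[1], STDOUT_FILENO);
		close(fds[1]);
		execvp(argv[0], argv);
		_exit(127);	/* only reached if exec failed */
	}

	/* Parent: keep only the read end. */
	close(fds[1]);
	*read_fd = fds[0];
	return pid;
}

/*
 * Hypothetical helper: ask the child to stop with SIGTERM and reap it.
 * Returns the number of the signal that terminated the child, 0 if the
 * child exited on its own, or -1 on error.
 */
static int stop_child(pid_t pid)
{
	int status;

	if (kill(pid, SIGTERM) < 0)
		return -1;
	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A caller would invoke spawn_with_pipe() before it starts its own measurement,
call stop_child() once the measurement ends, and then read any remaining data
from the returned file descriptor before closing it.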