Re: [RFC PATCH v10 3/8] perf stat: Fork and launch perf record when perf stat needs to get retire latency value for a metric.

From: Namhyung Kim
Date: Fri May 31 2024 - 17:40:47 EST


On Thu, May 30, 2024 at 11:46 PM Wang, Weilin <weilin.wang@xxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Namhyung Kim <namhyung@xxxxxxxxxx>
> > Sent: Thursday, May 30, 2024 11:41 PM
> > To: Wang, Weilin <weilin.wang@xxxxxxxxx>
> > Cc: Ian Rogers <irogers@xxxxxxxxxx>; Arnaldo Carvalho de Melo
> > <acme@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Ingo Molnar
> > <mingo@xxxxxxxxxx>; Alexander Shishkin
> > <alexander.shishkin@xxxxxxxxxxxxxxx>; Jiri Olsa <jolsa@xxxxxxxxxx>; Hunter,
> > Adrian <adrian.hunter@xxxxxxxxx>; Kan Liang <kan.liang@xxxxxxxxxxxxxxx>;
> > linux-perf-users@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Taylor, Perry
> > <perry.taylor@xxxxxxxxx>; Alt, Samantha <samantha.alt@xxxxxxxxx>; Biggers,
> > Caleb <caleb.biggers@xxxxxxxxx>
> > Subject: Re: [RFC PATCH v10 3/8] perf stat: Fork and launch perf record when
> > perf stat needs to get retire latency value for a metric.
> >
> > On Tue, May 28, 2024 at 11:43 PM <weilin.wang@xxxxxxxxx> wrote:
> > >
> > > From: Weilin Wang <weilin.wang@xxxxxxxxx>
> > >
> > > When retire_latency value is used in a metric formula, evsel would fork a perf
> > > record process with "-e" and "-W" options. Perf record will collect required
> > > retire_latency values in parallel while perf stat is collecting counting values.
> > >
> > > At the point in time that perf stat stops counting, evsel would stop perf
> > > record by sending a SIGTERM signal to the perf record process. The sampled
> > > data will be processed to get the retire latency value.
> > >
> > > Another thread is required to synchronize between perf stat and perf record
> > > when we pass data through a pipe.
> > >
> > > Signed-off-by: Weilin Wang <weilin.wang@xxxxxxxxx>
> > > ---
> > [SNIP]
> > > +int tpebs_set_evsel(struct evsel *evsel, int cpu_map_idx, int thread)
> > > +{
> > > + struct perf_counts_values *count;
> > > + struct tpebs_retire_lat *t;
> > > + bool found = false;
> > > + __u64 val;
> > > + int ret;
> > > +
> > > + /* A non-retire_latency evsel should never enter this function. */
> > > + if (!evsel__is_retire_lat(evsel))
> > > + return -1;
> > > +
> > > + ret = tpebs_stop();
> > > + if (ret)
> > > + return ret;
> > > +
> > > + count = perf_counts(evsel->counts, cpu_map_idx, thread);
> > > +
> > > + list_for_each_entry(t, &tpebs_results, nd) {
> > > + if (!strcmp(t->tpebs_name, evsel->name) || !strcmp(t->tpebs_name, evsel->metric_id)) {
> > > + found = true;
> > > + break;
> > > + }
> > > + }
> > > +
> > > + /* Set ena and run to non-zero */
> > > + count->ena = count->run = 1;
> > > + count->lost = 0;
> > > +
> > > + if (!found) {
> > > + /*
> > > + * Set default value or 0 when retire_latency for this event is
> > > + * not found from sampling data (enable_tpebs_recording not set
> > > + * or 0 sample recorded).
> > > + */
> > > + count->val = 0;
> > > + return 0;
> > > + }
> > > +
> > > + /*
> > > + * Only set retire_latency value to the first CPU and thread.
> > > + */
> > > + if (cpu_map_idx == 0 && thread == 0) {
> > > + /* Lost precision when casting from double to __u64. Any improvement? */
> >
> > As I said before, I think you can set t->val * 1000 and then
> > set the evsel->scale to 1e3 or 1e-3.
>
> Hi Namhyung,
>
> Sorry if this is a repeated message; I thought I had replied to your
> suggestion last time. I'd prefer to keep it like this for now and only make
> the change if we find the precision loss is critical, since we don't want to
> add special-case code for the calculation and the final print, to keep the
> code simple.
>
> I kept this comment here so that we don't forget about it. Please let me
> know if you'd like me to remove it.

Please see print_counter_aggrdata(). It's the generic code that prints
the event value, and it displays the value multiplied by the scale
(which defaults to 1.0). So you can keep the precision as long as you
set the scale properly (1e-3).

Thanks,
Namhyung