RE: [RFC PATCH v6 2/5] perf stat: Fork and launch perf record when perf stat needs to get retire latency value for a metric.

From: Wang, Weilin
Date: Wed Apr 24 2024 - 13:08:36 EST

Next message: Jonathan Cameron: "Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs"
Previous message: Junio C Hamano: "[ANNOUNCE] Git v2.45.0-rc1"
Next in thread: Namhyung Kim: "Re: [RFC PATCH v6 2/5] perf stat: Fork and launch perf record when perf stat needs to get retire latency value for a metric."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

> -----Original Message-----
> From: Namhyung Kim <namhyung@xxxxxxxxxx>
> Sent: Tuesday, April 23, 2024 4:06 PM
> To: Wang, Weilin <weilin.wang@xxxxxxxxx>
> Cc: Ian Rogers <irogers@xxxxxxxxxx>; Arnaldo Carvalho de Melo
> <acme@xxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Ingo Molnar
> <mingo@xxxxxxxxxx>; Alexander Shishkin
> <alexander.shishkin@xxxxxxxxxxxxxxx>; Jiri Olsa <jolsa@xxxxxxxxxx>; Hunter,
> Adrian <adrian.hunter@xxxxxxxxx>; Kan Liang <kan.liang@xxxxxxxxxxxxxxx>;
> linux-perf-users@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Taylor, Perry
> <perry.taylor@xxxxxxxxx>; Alt, Samantha <samantha.alt@xxxxxxxxx>; Biggers,
> Caleb <caleb.biggers@xxxxxxxxx>
> Subject: Re: [RFC PATCH v6 2/5] perf stat: Fork and launch perf record when
> perf stat needs to get retire latency value for a metric.
>
> On Tue, Apr 23, 2024 at 3:16 PM Wang, Weilin <weilin.wang@xxxxxxxxx>
> wrote:
> > > > > > -static int __run_perf_record(void)
> > > > > > +static int __run_perf_record(const char **record_argv)
> > > > > > {
> > > > > > + int i = 0;
> > > > > > + struct tpebs_event *e;
> > > > > > +
> > > > > > pr_debug("Prepare perf record for retire_latency\n");
> > > > > > +
> > > > > > + record_argv[i++] = "perf";
> > > > > > + record_argv[i++] = "record";
> > > > > > + record_argv[i++] = "-W";
> > > > > > + record_argv[i++] = "--synth=no";
> > > > > > +
> > > > > > + if (stat_config.user_requested_cpu_list) {
> > > > > > + record_argv[i++] = "-C";
> > > > > > + record_argv[i++] = stat_config.user_requested_cpu_list;
> > > > > > + }
> > > > > > +
> > > > > > + if (stat_config.system_wide)
> > > > > > + record_argv[i++] = "-a";
> > > > > > +
> > > > > > + if (!stat_config.system_wide
> > > && !stat_config.user_requested_cpu_list)
> > > > > {
> > > > > > + pr_err("Require -a or -C option to run sampling.\n");
> > > > > > + return -ECANCELED;
> > > > > > + }
> > > > > > +
> > > > > > + list_for_each_entry(e, &stat_config.tpebs_events, nd) {
> > > > > > + record_argv[i++] = "-e";
> > > > > > + record_argv[i++] = e->name;
> > > > > > + }
> > > > > > +
> > > > > > + record_argv[i++] = "-o";
> > > > > > + record_argv[i++] = PERF_DATA;
> > > > > > +
> > > > > > return 0;
> > > > > > }
> > > > >
> > > > > Still I think it's weird it has 'perf record' in perf stat (despite the
> > > > > 'perf stat record'). If it's only Intel thing and we don't have a plan
> > > > > to do the same on other arches, we can move it to the arch
> > > > > directory and keep the perf stat code simple.
> > > >
> > > > I'm not sure what is the proper way to solve this. And Ian mentioned
> > > > that put code in arch directory could potentially cause other bugs.
> > > > So I'm wondering if we could keep this code here for now. I could work
> > > > on it later if we found it's better to be in arch directory.
> > >
> > > Maybe somewhere in the util/ and keep the main code minimal.
> > > IIUC it's only for very recent (or upcoming?) Intel CPUs and we
> > > don't have tests (hopefully can run on other arch/CPUs).
> > >
> > > So I don't think having it here would help fixing potential bugs.
> >
> > Do you mean by creating a new file in util/ to hold this code?
>
> Yeah, maybe util/intel-tpebs.c (if it's better than arch/x86/...) ?
>
> >
> > Yes, this feature is for very recent Intel CPUs. It should only be triggered if
> > a metric uses event(s) that has the R modifier in the formula.
>
> Can we have a test with a fake metric so that we can test
> the code on non-(or old-)Intel machines?

All the existing metrics in non-(or old-)Intel CPUs should work as usual. So I think
existing metric tests should work for it.

If we want to add a fake metric uses the :R modifier for those platforms, the metric
should either fail (if the fake metric uses an event not exist on the test platform) or
return all 0 retire latency data.

So, I'm not quite sure what we want the fake metric to test for. Maybe we could
continue using existing metric tests to ensure all the defined metrics work correctly
in each machine under test?

Thanks,
Weilin

>
> Thanks,
> Namhyung

Next message: Jonathan Cameron: "Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs"
Previous message: Junio C Hamano: "[ANNOUNCE] Git v2.45.0-rc1"
Next in thread: Namhyung Kim: "Re: [RFC PATCH v6 2/5] perf stat: Fork and launch perf record when perf stat needs to get retire latency value for a metric."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]