Re: [PATCH] perf test: Retry without grouping for all metrics test

From: Arnaldo Carvalho de Melo
Date: Wed Dec 06 2023 - 12:54:34 EST


On Wed, Dec 06, 2023 at 08:35:23AM -0800, Ian Rogers wrote:
> On Wed, Dec 6, 2023 at 5:08 AM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> > Hmm, I'm not able to reproduce the problem here, before applying
> > this patch:

> Please don't apply the patch. The patch masks a bug in metrics/PMUs

I didn't

> and the proper fix was:
> 8d40f74ebf21 perf vendor events amd: Fix large metrics
> https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@xxxxxxx

that is upstream:

⬢[acme@toolbox perf-tools-next]$ git log tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
commit 8d40f74ebf217d3b9e9b7481721e6236b857cc55
Author: Sandipan Das <sandipan.das@xxxxxxx>
Date: Thu Jul 6 12:04:40 2023 +0530

perf vendor events amd: Fix large metrics

There are cases where a metric requires more events than the number of
available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
data fabric counters but the "nps1_die_to_dram" metric has eight events.

By default, the constituent events are placed in a group and since the
events cannot be scheduled at the same time, the metric is not computed.
The "all metrics" test also fails because of this.

Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
the user to run perf with "--metric-no-group".

E.g.

$ sudo perf test -v 101

Before:

101: perf all metrics test :
--- start ---
test child forked, pid 37131
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Metric 'nps1_die_to_dram' not printed in:
Error:
Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with -1
---- end ----
perf all metrics test: FAILED!

After:

101: perf all metrics test :
--- start ---
test child forked, pid 43766
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with 0
---- end ----
perf all metrics test: Ok

Reported-by: Ayush Jain <ayush.jain3@xxxxxxx>
Suggested-by: Ian Rogers <irogers@xxxxxxxxxx>
Signed-off-by: Sandipan Das <sandipan.das@xxxxxxx>
Acked-by: Ian Rogers <irogers@xxxxxxxxxx>
Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: Ananth Narayan <ananth.narayan@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ravi Bangoria <ravi.bangoria@xxxxxxx>
Cc: Santosh Shukla <santosh.shukla@xxxxxxx>
Link: https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@xxxxxxx
Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
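
To spell out the arithmetic behind that commit message, here is a toy Python model. It is not perf code: the function names are made up for illustration, and only the "four data fabric counters, eight events" figures come from the commit message. It assumes that ungrouped events can take turns on the counters, which is what makes the metric computable once the group constraint is dropped.

```python
# Toy model, not perf code. Function names are illustrative assumptions;
# only the counter/event counts below come from the commit message.

def can_schedule_as_group(num_events, num_counters):
    """A group needs every one of its events on a counter at the same time."""
    return num_events <= num_counters


def multiplex_rounds(num_events, num_counters):
    """Ungrouped events can take turns; return how many turns are needed."""
    return -(-num_events // num_counters)  # ceiling division


DF_COUNTERS = 4              # data fabric counters on Zen, Zen 2 and Zen 3
NPS1_DIE_TO_DRAM_EVENTS = 8  # events used by the "nps1_die_to_dram" metric

grouped_ok = can_schedule_as_group(NPS1_DIE_TO_DRAM_EVENTS, DF_COUNTERS)
rounds = multiplex_rounds(NPS1_DIE_TO_DRAM_EVENTS, DF_COUNTERS)
print(f"grouped schedulable: {grouped_ok}, ungrouped rounds: {rounds}")
```

With these numbers the group cannot be scheduled at all, while the ungrouped events fit in two turns, which matches the before/after test output quoted above.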

> > Ian, I also stumbled on this:

> > [root@five ~]# perf stat -M dram_channel_data_controller_4
> > Cannot find metric or group `dram_channel_data_controller_4'
> > ^C
> > Performance counter stats for 'system wide':

> > 284,908.91 msec cpu-clock # 32.002 CPUs utilized
> > 6,485,456 context-switches # 22.763 K/sec
> > 719 cpu-migrations # 2.524 /sec
> > 32,800 page-faults # 115.125 /sec

<SNIP>

> > I.e. -M should bail out at that point (Cannot find metric or group `dram_channel_data_controller_4'), no?

> We could. I suspect the code has always just not bailed out. I'll put
> together a patch adding the bail out.
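
Just to make the expected behaviour concrete, a rough Python sketch of "bail out on an unknown metric". This is not how perf implements it: KNOWN_METRICS, parse_metrics() and main() are hypothetical names, and the metric table is a made-up stand-in for whatever the tool looks names up in.

```python
# Hypothetical sketch, not perf's implementation. KNOWN_METRICS,
# parse_metrics() and main() are invented names for illustration.
import sys

KNOWN_METRICS = {"nps1_die_to_dram", "branch_misprediction_ratio"}


def parse_metrics(requested):
    """Return the requested metric names, raising if any name is unknown."""
    names = [n for n in requested.split(",") if n]
    for name in names:
        if name not in KNOWN_METRICS:
            raise ValueError(f"Cannot find metric or group `{name}'")
    return names


def main(requested):
    """Return a process-style exit status for a -M style argument."""
    try:
        metrics = parse_metrics(requested)
    except ValueError as err:
        print(err, file=sys.stderr)
        return 1  # bail out here, before any counting starts
    print("counting:", ", ".join(metrics))
    return 0


status = main("dram_channel_data_controller_4")
print("exit status:", status)
```

The point is only the ordering: the lookup failure is turned into a non-zero exit before the tool falls back to system-wide counting.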

Great, thanks,

- Arnaldo