Re: [PATCH v6 1/2] perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu

From: Aishwarya TCV
Date: Wed Jun 12 2024 - 07:19:56 EST

On 15/05/2024 07:01, Ian Rogers wrote:
> The mrvl_ddr_pmu is uncore and has a hexadecimal address suffix while
> the previous PMU sorting/merging code assumes uncore PMU names start
> with uncore_ and have a decimal suffix. Because of the previous
> assumption it isn't possible to wildcard the mrvl_ddr_pmu.
>
> Modify pmu_name_len_no_suffix but also remove the suffix number out
> argument, this is because we don't know if a suffix number of say 100
> is in hexadecimal or decimal. As the only use of the suffix number is
> in comparisons, it is safe there to compare the values as hexadecimal.
> Modify perf_pmu__match_ignoring_suffix so that hexadecimal suffixes
> are ignored.
>
> Only allow hexadecimal suffixes to be greater than length 2 (ie 3 or
> more) so that S390's cpum_cf PMU doesn't lose its suffix.
>
> Change the return type of pmu_name_len_no_suffix to size_t to
> workaround GCC incorrectly determining the result could be negative.
>
> Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> ---
> tools/perf/util/pmu.c | 33 +++++++++++++--------
> tools/perf/util/pmus.c | 67 ++++++++++++++++++++++++------------------
> tools/perf/util/pmus.h | 7 ++++-
> 3 files changed, 65 insertions(+), 42 deletions(-)
>

Hi Ian,

The perf test "perf_all_PMU_test" fails when run against next-master
(next-20240612) on Arm64 JUNO in our CI. The failure appears to be
specific to JUNO: the same test passes on other boards such as RB5 and
Ampere_altra. We suspect that the suffixed PMU name 'armv8_pmuv3_0' is
the reason for the test failure.
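
To make the suspicion concrete, here is a minimal sketch of the suffix
rule as the commit message above describes it. It is not the code in
tools/perf/util/pmus.c; the function name name_len_no_suffix_sketch is
hypothetical and the logic is only my reading of the description (a
decimal suffix of any length is stripped, a hexadecimal suffix only when
it is longer than 2 characters).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, written from the commit message only.
 * Returns the length of `name` without a trailing "_<number>" suffix,
 * or the full length if no qualifying suffix is present.
 */
size_t name_len_no_suffix_sketch(const char *name)
{
	size_t orig_len = strlen(name);
	size_t len = orig_len;
	bool has_hex_letters = false;

	/* Walk backwards over trailing hexadecimal digits. */
	while (len > 0 && isxdigit((unsigned char)name[len - 1])) {
		if (!isdigit((unsigned char)name[len - 1]))
			has_hex_letters = true;
		len--;
	}

	/* A suffix needs at least one digit and a '_' right before it. */
	if (len > 0 && len != orig_len && name[len - 1] == '_') {
		/*
		 * Decimal suffixes of any length are stripped. Suffixes
		 * containing a-f are stripped only when longer than 2
		 * characters, so that "cpum_cf" keeps its "_cf".
		 */
		if (!has_hex_letters || (orig_len - len) > 2)
			return len - 1;
	}
	return orig_len;
}
```

Under this reading, "armv8_pmuv3_0" is shortened to "armv8_pmuv3"
(length 11) and "cpum_cf" keeps its full length of 7. An illustrative
name with a long hexadecimal suffix, such as "mrvl_ddr_pmu_87e1b0000000",
is shortened to "mrvl_ddr_pmu" (length 12).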

Reverting the change (3241d46f5f54) seems to fix it.

The test passes on Linux v6.10-rc3.

Failure log
------------
110: perf all PMU test:
--- start ---
test child forked, pid 8279
Testing armv8_pmuv3/br_immed_retired/
Event 'armv8_pmuv3/br_immed_retired/' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
Average synthesis took: 1169.431 usec (+- 0.144 usec)
Average num. events: 35.000 (+- 0.000)
Average time per event 33.412 usec
Average data synthesis took: 1225.698 usec (+- 0.102 usec)
Average num. events: 119.000 (+- 0.000)
Average time per event 10.300 usec

Performance counter stats for 'perf bench internals synthesize':

3263664785 armv8_pmuv3_0/br_immed_retired/


25.472854464 seconds time elapsed

8.004791000 seconds user
17.060209000 seconds sys
---- end(-1) ----
110: perf all PMU test :
FAILED!
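
The mismatch in the log can be reproduced in isolation. The sketch below
assumes the test looks for the requested event string as a plain
substring of the perf stat output. That is an assumption based on the
"not printed in" message, not a copy of the test script, and
event_printed_sketch is a hypothetical helper.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check: true when the requested event string occurs
 * verbatim somewhere in the captured perf stat output.
 */
bool event_printed_sketch(const char *output, const char *event)
{
	return strstr(output, event) != NULL;
}
```

With the output line "3263664785 armv8_pmuv3_0/br_immed_retired/", the
check fails for "armv8_pmuv3/br_immed_retired/" because the printed name
has "_0" before the first slash. It succeeds for
"armv8_pmuv3_0/br_immed_retired/".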

Thanks,
Aishwarya