Re: [PATCH v3 2/3] perf pmus: Simplify perf_pmus__find_core_pmu()

From: James Clark
Date: Fri Sep 15 2023 - 07:18:14 EST




On 13/09/2023 16:33, James Clark wrote:
> Currently the while loop always either exits on the first iteration with
> a core PMU, or exits with NULL on heterogeneous systems or when not all
> CPUs are online.
>
> Both of the latter behaviors are undesirable for platforms other than
> Arm so simplify it to always return the first core PMU, or NULL if none
> exist.
>
> This behavior was depended on by the Arm version of
> pmu_metrics_table__find(), so the logic has been moved there instead.
>
> Suggested-by: Ian Rogers <irogers@xxxxxxxxxx>
> Reviewed-by: Ian Rogers <irogers@xxxxxxxxxx>
> Signed-off-by: James Clark <james.clark@xxxxxxx>

Turns out the "Simple expression parser" test is failing on
heterogeneous arm systems without this patch. I didn't realise there was
a dependency and should have put the commits the other way round. I will
leave the error message here in case someone bumps into it, but no fix
is required apart from applying the remaining patches in this set:

$ perf test expr -v
4: Simple expression parser :
--- start ---
test child forked, pid 4902
Using CPUID 0x00000000410fd070
FAILED tests/expr.c:83 get_cpuid
test child finished with -1
---- end ----
Simple expression parser: FAILED!


> ---
> tools/perf/arch/arm64/util/pmu.c | 8 +++++++-
> tools/perf/util/pmus.c | 14 +-------------
> 2 files changed, 8 insertions(+), 14 deletions(-)
>
> diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
> index 3d9330feebd2..3099f5f448ba 100644
> --- a/tools/perf/arch/arm64/util/pmu.c
> +++ b/tools/perf/arch/arm64/util/pmu.c
> @@ -10,8 +10,14 @@
>
> const struct pmu_metrics_table *pmu_metrics_table__find(void)
> {
> - struct perf_pmu *pmu = perf_pmus__find_core_pmu();
> + struct perf_pmu *pmu;
> +
> + /* Metrics aren't currently supported on heterogeneous Arm systems */
> + if (perf_pmus__num_core_pmus() > 1)
> + return NULL;
>
> + /* Doesn't matter which one here because they'll all be the same */
> + pmu = perf_pmus__find_core_pmu();
> if (pmu)
> return perf_pmu__find_metrics_table(pmu);
>
> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index cec869cbe163..64e798e68a2d 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -596,17 +596,5 @@ struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
>
> struct perf_pmu *perf_pmus__find_core_pmu(void)
> {
> - struct perf_pmu *pmu = NULL;
> -
> - while ((pmu = perf_pmus__scan_core(pmu))) {
> - /*
> - * The cpumap should cover all CPUs. Otherwise, some CPUs may
> - * not support some events or have different event IDs.
> - */
> - if (RC_CHK_ACCESS(pmu->cpus)->nr != cpu__max_cpu().cpu)
> - return NULL;
> -
> - return pmu;
> - }
> - return NULL;
> + return perf_pmus__scan_core(NULL);
> }