Re: [PATCH] perf tool: arch: arm64: change the way for get_cpuid_str() function

From: Will Deacon
Date: Fri Jul 19 2019 - 06:02:29 EST


On Fri, Jul 19, 2019 at 08:29:12AM +0000, Joakim Zhang wrote:
> > On Thu, Jul 18, 2019 at 06:21:30AM +0000, Joakim Zhang wrote:
> > > Now the get_cpuid__str function returns the MIDR string of the first
> > > online cpu from the range of cpus asscociated with the PMU CORE device.
> > >
> > > It can work when pass a perf_pmu entity to get_cpuid_str. However, it
> > > will pass NULL via perf_pmu__find_map from metricgroup.c if we want to
> > > add metric group for arm64. When pass NULL to get_cpuid_str, it can't
> > > return the MIDR string.
> >
> > Why is this code passing a NULL PMU? What information does it actually need?
> > Other functions, such as print_pmu_events(), iterate over all PMUs.
>
> tools/perf/utils/metricgroup.c:
> metricgroup__add_metric()/metricgroup__has_metric()/metricgroup__print()---->perf_pmu__find_map(NULL)------>perf_pmu__getcpuid(NULL)----->get_cpuid_str(NULL)
> So, it will eventually pass NULL to get_cpuid_str() function for arm64, and now get_cpuid_str() for arm64 need *struct perf_pmu* entity to return MIDR string.
>
> And the declaration for get_cpuid_str() is char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused) in tools/perf/utils/header.h file.
> So, it can return MIDR string without passing *struct perf_pmu* entity. Arch PowerPC/x86/s390 which implement metricgroup all add __maybe_unused.
>
> > > There are three methods from userspace getting MIDR string for arm64:
> > > 1. parse
> > > sysfs(/sys/devices/system/cpu/cpu?/regs/identification/midr_el1)
> > > 2. parse procfs(/proc/cpuinfo)
> > > 3. read the hwcaps(MIDR register) with getauxval(AT_HWCAP)
> > >
> > > Perfer to select #3 as it is more simple and direct.
> >
> > That's probably terminally broken for heterogeneous systems, so I'm not at all
> > happy with this patch.
>
> The implement of get_cpuid_str() for arm64 now only can return the MIDR
> string of the first online cpu. I think it is also broken for
> heterogeneous systems.

I disagree: the current code returns the MIDR string for the first online
CPU *corresponding to the PMU* (e.g. pmu->cpus) and therefore handles
heterogeneous systems like that.

> Will, do you know how to fix this issue more reasonable?

I still don't know what the metricgroup coce is trying to achieve, but my
previous suggestion about looking at print_pmu_events() is probably still a
good place to start.

Will