Re: [PATCH 1/2] perf evlist: Keep topdown counters in weak group

From: Ian Rogers
Date: Mon May 09 2022 - 13:28:59 EST


On Thu, May 5, 2022 at 12:44 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 5/5/2022 2:31 PM, Ian Rogers wrote:
> >>> So I think fixing all of these should be a follow up. I am working to
> >>> get access to an Alderlake system, could we land this first?
> >>>
> >> I think we can use pmu_name to replace the "cpu" to fix the issue for
> >> the hybrid platform. For a hybrid platform, the pmu_name is either
> >> cpu_atom or cpu_core.
> >>
> >> Besides, the topdown events may have a PMU prefix, e.g.,
> >> cpu_core/topdown-be-bound/. The strcasecmp may not work well for this case.
> >>
> >> How about the below patch?
> >> If it's OK for you, could you please merge it into your V2 patch set?
> >> I can do the test on a ADL system.
> >>
> >> diff --git a/tools/perf/arch/x86/util/evsel.c
> >> b/tools/perf/arch/x86/util/evsel.c
> >> index 40b171de2086..551ae2bab70e 100644
> >> --- a/tools/perf/arch/x86/util/evsel.c
> >> +++ b/tools/perf/arch/x86/util/evsel.c
> >> @@ -33,11 +33,12 @@ void arch_evsel__fixup_new_cycles(struct
> >> perf_event_attr *attr)
> >>
> >> bool arch_evsel__must_be_in_group(const struct evsel *evsel)
> >> {
> >> - if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
> >> - !pmu_have_event("cpu", "slots"))
> >> + const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
> >> +
> >> + if (!pmu_have_event(pmu_name, "slots"))
> >> return false;
> > Hmm. The idea with this test is to see if the architecture supports
> > topdown events before going further. There's a similar test in all the
> > arch_evlist functions. I think with cpu_core this needs to become:
> >
>
> The case is a little bit different here. For the arch_evlist functions,
> the input is the evlist, not the specific evsel. So we have to check all
> the possible PMU names which are "cpu" and "cpu_core". Then we decide
> whether going further.
>
> The input of the evsel__must_be_in_group() is the evsel. The PMU name is
> stored in the evsel->pmu_name. I don't think we need to check all the
> possible PMU names. Using evsel->pmu_name should be good enough.
>
> > if (!pmu_have_event("cpu", "slots") && !pmu_have_event("cpu_core", "slots") )
> >
> > But we should add a helper function for this. It is odd to have this
> > change supporting Alderlake but the existing evlist work not. Perhaps
> > we should just wait until Zhengjun's patches land.
>
> Yes, a helper function is good for the arch_evlist functions. But I
> don't think this patch needs the helper function. Zhengjun's patches are
> to fix the other topdown issues on ADL. There is no dependency between
> this patch and zhengjun's patches.
>
> Thanks,
> Kan

TL;DR I think we can move forward with landing these patches to fix Icelake.

For Alderlake/hybrid we have a problem. To determine what happens with
grouping we need to know does the CPU have topdown events? This is a
runtime question for doing perf_event_open and so an arch test and
weak symbol are appropriate. For Icelake we are determining the
presence of topdown events by looking at the special PMU cpu. For
Alderlake the same information can be found by looking at the PMUs
cpu_core and cpu_atom, but how to discover those PMU names? It is
already somewhat concerning that we've hard coded "cpu" and we don't
want to have an ever growing list of PMU names.

We have similarly hard coded "cpu" in the topology code here:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/cputopo.c?h=tmp.perf/core#n18
Is this unreasonable given cpu is already supposed to be ABI stable:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/Documentation/ABI/stable/sysfs-devices-system-cpu?h=tmp.perf/core

It is hard to say what the right hybrid fix is here. I should get a
system I can poke shortly. I'd also like to compare what's in sysfs
for Alderlake with ARM's big.little approach. I can imagine we need a
function that returns a list of CPU like PMUs for probing. Ideally we
could work this out from sysfs and use some stable ABI.

Thanks,
Ian