Re: [PATCH 1/4] perf evsel: Fixes topdown events in a weak group for the hybrid platform

From: Liang, Kan
Date: Fri May 13 2022 - 13:25:09 EST




On 5/13/2022 12:43 PM, Ian Rogers wrote:
On Fri, May 13, 2022 at 9:24 AM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:



On 5/13/2022 11:39 AM, Ian Rogers wrote:
On Fri, May 13, 2022 at 8:16 AM <kan.liang@xxxxxxxxxxxxxxx> wrote:

From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>

The patch ("perf evlist: Keep topdown counters in weak group") fixes the
perf metrics topdown event issue when the topdown events are in a weak
group on a non-hybrid platform. However, it doesn't work for the hybrid
platform.

$./perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1

Performance counter stats for 'system wide':

751,765,068 cpu_core/slots/ (84.07%)
<not supported> cpu_core/topdown-bad-spec/
<not supported> cpu_core/topdown-be-bound/
<not supported> cpu_core/topdown-fe-bound/
<not supported> cpu_core/topdown-retiring/
12,398,197 cpu_core/branch-instructions/ (84.07%)
1,054,218 cpu_core/branch-misses/ (84.24%)
539,764,637 cpu_core/bus-cycles/ (84.64%)
14,683 cpu_core/cache-misses/ (84.87%)
7,277,809 cpu_core/cache-references/ (77.30%)
222,299,439 cpu_core/cpu-cycles/ (77.28%)
63,661,714 cpu_core/instructions/ (84.85%)
0 cpu_core/mem-loads/ (77.29%)
12,271,725 cpu_core/mem-stores/ (77.30%)
542,241,102 cpu_core/ref-cycles/ (84.85%)
8,854 cpu_core/cache-misses/ (76.71%)
7,179,013 cpu_core/cache-references/ (76.31%)

1.003245250 seconds time elapsed

A hybrid platform has a different PMU name for the core PMUs, while
the current perf hard code the PMU name "cpu".

The evsel->pmu_name can be used to replace the "cpu" to fix the issue.
For a hybrid platform, the pmu_name must be non-NULL. Because there are
at least two core PMUs. The PMU has to be specified.
For a non-hybrid platform, the pmu_name may be NULL. Because there is
only one core PMU, "cpu". For a NULL pmu_name, we can safely assume that
it is a "cpu" PMU.

With the patch,

$perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1

Performance counter stats for 'system wide':

766,620,266 cpu_core/slots/ (84.06%)
73,172,129 cpu_core/topdown-bad-spec/ # 9.5% bad speculation (84.06%)
193,443,341 cpu_core/topdown-be-bound/ # 25.0% backend bound (84.06%)
403,940,929 cpu_core/topdown-fe-bound/ # 52.3% frontend bound (84.06%)
102,070,237 cpu_core/topdown-retiring/ # 13.2% retiring (84.06%)
12,364,429 cpu_core/branch-instructions/ (84.03%)
1,080,124 cpu_core/branch-misses/ (84.24%)
564,120,383 cpu_core/bus-cycles/ (84.65%)
36,979 cpu_core/cache-misses/ (84.86%)
7,298,094 cpu_core/cache-references/ (77.30%)
227,174,372 cpu_core/cpu-cycles/ (77.31%)
63,886,523 cpu_core/instructions/ (84.87%)
0 cpu_core/mem-loads/ (77.31%)
12,208,782 cpu_core/mem-stores/ (77.31%)
566,409,738 cpu_core/ref-cycles/ (84.87%)
23,118 cpu_core/cache-misses/ (76.71%)
7,212,602 cpu_core/cache-references/ (76.29%)

1.003228667 seconds time elapsed

Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
---
tools/perf/arch/x86/util/evsel.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index 00cb4466b4ca..24510bcb4bf4 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -33,8 +33,9 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)

bool arch_evsel__must_be_in_group(const struct evsel *evsel)
{
- if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
- !pmu_have_event("cpu", "slots"))
+ const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
+
+ if (!pmu_have_event(pmu_name, "slots"))

Playing devil's advocate, if I have a PMU for my network accelerator
and it has an event called "slots" then this test will also be true.


IIRC, the pmu_have_event should only check the event which is exposed by
the kernel. It's very unlikely that another PMU expose the exact same name.

If you still worry about it, I think we can check the PMU type
PERF_TYPE_RAW here, which is reserved for the core PMU. Others cannot
use it.

That's cool, this isn't documented behavior though:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/include/uapi/linux/perf_event.h?h=perf/core#n34
and PERF_TYPE_HARDWARE wouldn't seem a wholly unreasonable type. It
kind of feels like depending on a quirk, and so we should bury the
quirk in a helper function and document it :-)

The PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE are aliases for the PERF_TYPE_RAW. The PERF_TYPE_HARDWARE is used for the 10 common hardware events and The PERF_TYPE_HW_CACHE is used for the hardware cache events.
Other core events (not include the atom core events in a hybrid machine) should have the PERF_TYPE_RAW type.

Since the perf metrics is a big core only feature, checking both the PERF_TYPE_RAW type and the slots event should be good enough here.

I will add more comments in V2.


It looks like arch_evsel__must_be_in_group() is the only user for the
evsel__sys_has_perf_metrics() for now, so I make it static.

The other pmu_have_event("cpu", "slots") is in evlist.c.
topdown_sys_has_perf_metrics() in patch 4 should be used to replace it.
I think Zhengjun will post patches for the changes for the evlist.c

Ok, is Zhengjun putting his changes on top of this and fixing up the
APIs or is he waiting on these changes landing? Let me know how to
help. I'm guessing landing my changes is the first step.

Right. Your changes is the first step. Then this patch set. Zhengjun's will be on top of us.

Thanks,
Kan


Thanks,
Ian

diff --git a/tools/perf/arch/x86/util/evsel.c
b/tools/perf/arch/x86/util/evsel.c
index 24510bcb4bf4..a4714174e30f 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -31,11 +31,20 @@ void arch_evsel__fixup_new_cycles(struct
perf_event_attr *attr)
free(env.cpuid);
}

-bool arch_evsel__must_be_in_group(const struct evsel *evsel)
+static bool evsel__sys_has_perf_metrics(const struct evsel *evsel)
{
const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";

- if (!pmu_have_event(pmu_name, "slots"))
+ if ((evsel->core.attr.type == PERF_TYPE_RAW) &&
+ pmu_have_event(pmu_name, "slots"))
+ return true;
+
+ return false;
+}
+
+bool arch_evsel__must_be_in_group(const struct evsel *evsel)
+{
+ if (!evsel__sys_has_perf_metrics(evsel))
return false;

return evsel->name &&

Thanks,
Kan

The property that is being tested here is "does this CPU have topdown
events" and so allowing any PMU removes the "does this CPU" part of
the equation. I think ideally we'd have an arch functions something
like:

bool arch_pmu__has_intel_topdown_events(void)
{
static bool has_topdown_events = pmu_have_event("cpu", "slots") ||
pmu_have_event("cpu_core", "slots");

return has_topdown_events;
}

bool arch_pmu__supports_intel_topdown_events(const char *pmu_name)
{
if (!pmu_name)
return false;
return arch_pmu__has_intel_topdown_events() && (!strncmp(pmu_name,
"cpu") || !strncmp(pmu_name, "cpu_core"));
}

bool arch_evsel__is_intel_topdown_event(struct evsel *evsel)
{
if (!arch_pmu__supports_intel_topdown_events(evsel->pmu))
return false;

return strcasestr(evsel->name, "slots") || strcasestr(evsel->name, "topdown");
}

This then gives us:

bool arch_evsel__must_be_in_group(const struct evsel *evsel)
{
return arch_evsel__is_intel_topdown_event(evsel);
}

These functions can then be reused for the arch_evlist topdown code,
etc. What I don't see in these functions is use of any hybrid
abstraction and so it isn't clear to me how with hybrid something like
this would be plumbed in.

Thanks,
Ian

return false;

return evsel->name &&
--
2.35.1