Re: [PATCH v5 4/4] perf parse-events: Reapply "Prefer sysfs/JSON hardware events over legacy"

From: Ian Rogers
Date: Fri Jan 10 2025 - 17:15:43 EST


On Fri, Jan 10, 2025 at 11:40 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> On Thu, Jan 09, 2025 at 02:21:09PM -0800, Ian Rogers wrote:
> > Originally posted and merged from:
> > https://lore.kernel.org/r/20240416061533.921723-10-irogers@xxxxxxxxxx
> > This reverts commit 4f1b067359ac8364cdb7f9fda41085fa85789d0f although
> > the patch is now smaller due to related fixes being applied in commit
> > 22a4db3c3603 ("perf evsel: Add alternate_hw_config and use in
> > evsel__match").
> > The original commit message was:
> >
> > It was requested that RISC-V be able to add events to the perf tool so
> > the PMU driver didn't need to map legacy events to config encodings:
> > https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@xxxxxxxxxxxx/
> >
> > This change makes the priority of events specified without a PMU the
> > same as those specified with a PMU, namely sysfs and JSON events are
> > checked first before using the legacy encoding.
>
> I'm still not convinced why we need this change despite of these
> troubles. If it's because RISC-V cannot define the lagacy hardware
> events in the kernel driver, why not using a different name in JSON and
> ask users to use the name specifically? Something like:
>
> $ perf record -e riscv-cycles ...

So ARM and RISC-V are more than able to speak for themselves and have
their tags on the series, but let's recap why I'm motivated to do this
change:

1) perf supported legacy events;
2) perf supported sysfs and json events, but at a lower priority than
legacy events;
3) hybrid support was added but in a way where all the hybrid PMUs
needed to be known, assumptions about PMU were implicit and baked into
the tool;
4) metric support for hybrid was going in a similar implicit direction
and I objected, what would cycles mean in a metric if the core PMU was
implicit? Rather than pursue this the hybrid code was overhauled, PMUs
became more of a thing and we added a notion of a "core" PMU which
would support legacy events;
5) ARM core PMUs differ in naming, etc. than just about every other
platform. Their core events had been being programmed as if they were
uncore events - ie without the legacy priority. Fixing hybrid, and
fixing ARM PMUs to know they supported legacy events, broke perf on
Apple-M? series due to a PMU driver issue with legacy events:
https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@xxxxxxxxx/
"Perf broke on all Apple ARM64 systems (tested almost everything), and
according to maz also on Juno (so, probably all big.LITTLE) since
v6.5."
6) sysfs/json events were made the priority over legacy to unbreak
perf on Apple-M? CPUs, but only if the PMU is specified:
https://lore.kernel.org/r/20231123042922.834425-1-irogers@xxxxxxxxxx
Reported-by: Hector Martin <marcan@xxxxxxxxx>
Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
Tested-by: Hector Martin <marcan@xxxxxxxxx>
Tested-by: Marc Zyngier <maz@xxxxxxxxxx>
Acked-by: Mark Rutland <mark.rutland@xxxxxxx>

This gets us to the current code where I can trivially get an
inconsistency. Here on Intel with no PMU in the event name:
```
$ perf stat -vv -e cpu-cycles true
Using CPUID GenuineIntel-6-8D-1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
size 136
config 0 (PERF_COUNT_HW_CPU_CYCLES)
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 752915 cpu -1 group_fd -1 flags 0x8 = 3
cpu-cycles: -1: 1293076 273429 273429
cpu-cycles: 1293076 273429 273429

Performance counter stats for 'true':

1,293,076 cpu-cycles

0.000809752 seconds time elapsed

0.000841000 seconds user
0.000000000 seconds sys
```

Here with a PMU event name:
```
$ sudo perf stat -vv -e cpu/cpu-cycles/ true
Using CPUID GenuineIntel-6-8D-1
Attempt to add: cpu/cpu-cycles=0/
..after resolving event: cpu/event=0x3c/
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 4 (cpu)
size 136
config 0x3c (cpu-cycles)
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 752839 cpu -1 group_fd -1 flags 0x8 = 3
cpu/cpu-cycles/: -1: 1421235 531150 531150
cpu/cpu-cycles/: 1421235 531150 531150

Performance counter stats for 'true':

1,421,235 cpu/cpu-cycles/

0.001292908 seconds time elapsed

0.001340000 seconds user
0.000000000 seconds sys
```

That is the no PMU event is opened as type=0/config=0 (legacy) while
the PMU event is opened as type=4/config=0x3c (sysfs encoding). Now
let's cross our fingers and hope that in the driver they are really
the same thing. I take objection to the idea that there should be two
different priorities for sysfs/json and legacy depending on whether a
PMU is or isn't specified in the event name. The priority could be
legacy then sysfs/json, or it could be sysfs/json then legacy, but it
should be the same regardless of whether the PMU is put in the event
name. The PMU in the event name should be optional, for example we may
or may not show it in the stat output. The encoding being consistent
was the case prior to the Apple-M? fix and this patch aims to make it
consistent once more. Given the ARM bug mentioned above it should also
fix specifying or not the PMU on Apple-M? CPUs as it will avoid the
same legacy event issue that may only exist on old kernels. RISC-V is
motivated because of not wanting hard coded legacy events in the
driver for all potential vendors and models.

Thanks,
Ian