Re: [PATCH v5 4/4] perf parse-events: Reapply "Prefer sysfs/JSON hardware events over legacy"
From: Namhyung Kim
Date: Thu Feb 06 2025 - 00:09:11 EST
On Tue, Feb 04, 2025 at 08:48:20PM -0800, Ian Rogers wrote:
> On Tue, Feb 4, 2025 at 5:58 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > On Mon, Feb 03, 2025 at 04:41:11PM -0800, Ian Rogers wrote:
> > > On Mon, Feb 3, 2025 at 4:15 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > > [snip]
> > > > Yep, I agree it's confusing. So my opinion is to use legacy encoding
> > > > and no default wildcard. :)
> > >
> > > Making it so that all non-legacy, non-core PMU events require a PMU is
> > > a breaking change and a regression for all users, command line event
> > > name suggesting, any tool built off of perf, and so on. Breaking all
> > > perf users and requiring all perf metrics be rewritten is well..
> > > something..
> >
> > Well, I guess the majority of users don't use non-core PMU events. And
> > we used to have PMU prefix on those events for years so old users should
> > not be affected. Actually perf list shows them with PMU prefix so I
> > think new users are also expected to use the PMU name.
> >
> > $ perf list pmu
> > ...
> > cstate_pkg/c2-residency/ [Kernel PMU event]
> > ...
> > i915/actual-frequency/ [Kernel PMU event]
> > i915/bcs0-busy/ [Kernel PMU event]
> > ...
> > msr/tsc/ [Kernel PMU event]
> > ...
> > power/energy-cores/ [Kernel PMU event]
> > ...
> > uncore_clock/clockticks/ [Kernel PMU event]
> > uncore_imc_free_running/data_read/ [Kernel PMU event]
> > ...
> >
> > The exception is the JSON events like below.
> >
> > uncore interconnect:
> > unc_arb_coh_trk_requests.all
> > [UNC_ARB_COH_TRK_REQUESTS.ALL. Unit: uncore_arb]
> >
> > which I hoped to be 'uncore_arb/unc_arb_coh_trk_requests.all/' or even
> > 'uncore_arb/coh_trk_requests.all/'. But it would be hard to change the
> > all metric expressions now. Also users can directly use them as they
> > are listed by `perf list`. So we need to support that without PMUs.
>
> So there's nothing wrong with your proposal except it breaks non-core
> events. We can't agree to flip the default on a flag for perf top:
> https://lore.kernel.org/lkml/20240516222159.3710131-1-irogers@xxxxxxxxxx/
> to make perf top behave as, you know, top does as it could be an
> option people depend on. A behavior that matters if you do user
> filtering as exited processes stay in perf top (both confusing and
> un-top like). Fwiw, that reminds me of another patch series being
> unreviewed:
> https://lore.kernel.org/lkml/20250111190143.1029906-1-irogers@xxxxxxxxxx/
Ok, I'll review that later. Sorry my review bandwidth is not very high.
> Anyway, the perf top flag is one that no-one knows exists on a command
> most people don't know exists - Julia Evans' zine of course loves it
> and we love Julia's work and the zine.
You mean the -z flag, which is documented in the man page and also in the
help message (perf top -h). Anyone who reads the docs can know it's
there. Of course, people would prefer reading zines to man pages. :)
> So, it would seem to me that
> changing something as fundamental as how all non-core events behave
> would be seen as a regression.
Yep, it'd be a regression. And that's why we cannot simply change the
behavior. But I guess not many users would be affected by that since
it's undocumented behavior.
> Imagine the person going to
> perfmon-events.intel.com, finding an event name and expecting to be
> able to use it with perf. Now they need to grub around in perf list to
> locate the PMU. What is appropriate for them to know about how
> suffixes work and show in perf list..? Well that's assuming suffixes
> work in the future as ARM will probably launch an a1000 CPU and the
> PMU will look like a hex suffix and the whole naming convention
> implodes.
Which suffix do you mean?
Anyway, a person who looked up the Intel webpage would be eager to learn
about performance-related things. Can we also assume that they want
to learn about the perf tool itself? :)
If it's not the case, we have this:

  $ perf record -e xxx
  event syntax error: 'xxx'
                       \___ Bad event name

  Unable to find event on a PMU of 'xxx'
  Run 'perf list' for a list of valid events

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list available events

So it says twice to run 'perf list' to see the events. Then they can
run either:

  $ perf list | grep xxx

or

  $ perf list xxx

to see the actual name of the event available in the perf tool.
>
> Even with this what would be the behavior of core events? You want
> legacy events to have priority over sysfs/json when there is no PMU.
> You know, and have stated not caring, RISC-V wants different and that
> it breaks Apple-M's PMUs for a fairly large range of kernel releases
> including 1 LTS kernel - the only reason I'm writing patches in this
> area in the 1st place. Software is soft and you can go fix software
> anywhere in the stack. Listening to vendors and not breaking everyone
> is the point-of-view these patches have been coming from. I find it
> very hard to have a conversation where this is just forgotten about
> and we're working on hypotheticals which seem to be both unwanted and
> implausible.
Sorry, I don't want to repeat that either. Correct me if I'm wrong:
1. RISC-V is working on a solution with the current status and it's not
absolutely necessary to change the current behavior.
2. Apple-M is fixed already.
>
> I don't know why people (yourself, Linus) keep wanting to show me the
> perf list output. It is arbitrary. I rewrote it and changed the
> behavior of all uncore PMUs within it (we didn't used to deduplicate
> based on the PMU suffix). It is nice that people think it reads like
> some religious text.
I think it's what we want users to look at to learn how to use the events.
> Why is the formatting different in perf list for
> json specified events? Well it is because json events have
> descriptions and the events you are showing with a PMU don't have a
> description. I think because there is no description, an effort was
> made to keep the output compact and put the PMU and event name
> together. It wasn't trying to enter some kind of long lasting marriage
> that the event name should only ever be used with the PMU.
I like the description but I don't like the formatting. I think I
understand why it looks like that but it could be different. Anyway,
I don't think showing the PMU name is related to having descriptions.
> What happens if an event is both in sysfs and json? Well the sysfs event
> will get the description from the json and then I believe it won't
> behave as you show. Did the event get broken, as perf list no longer
> shows it with a PMU, by having a json description written? I think not
> and I think having descriptions with events is a good thing.
That's bad. Probably we should fix it to take only one of the sources and
change the JSON event so it doesn't clash with sysfs.
Thanks,
Namhyung