Re: [PATCH v1] perf evlist: Force adding default events only to core PMUs
From: Namhyung Kim
Date: Fri Jun 07 2024 - 02:10:27 EST
Hello,
On Thu, Jun 06, 2024 at 10:42:33AM +0100, James Clark wrote:
>
>
> On 06/06/2024 08:09, Namhyung Kim wrote:
> > On Wed, Jun 5, 2024 at 4:02 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> >>
> >> On Wed, Jun 5, 2024 at 1:29 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >>>
> >>> On Thu, May 30, 2024 at 3:52 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >>>>
> >>>> On Thu, May 30, 2024 at 06:46:08AM -0700, Ian Rogers wrote:
> >>>>> On Thu, May 30, 2024 at 5:48 AM James Clark <james.clark@xxxxxxx> wrote:
> >>>>>>
> >>>>>> On 30/05/2024 06:35, Namhyung Kim wrote:
> >>>>>>> It might not be a perfect solution but it could be a simple one.
> >>>>>>> Ideally I think it'd be nice if the kernel exports more information
> >>>>>>> about the PMUs like sampling and exclude capabilities.
> >>>>>>> Thanks,
> >>>>>>> Namhyung
> >>>>>>
> >>>>>> That seems like a much better suggestion. Especially with the ever
> >>>>>> expanding retry/fallback mechanism that can never really take into
> >>>>>> account every combination of event attributes that can fail.
> >>>>>
> >>>>> I think this approach can work but we may break PMUs.
> >>>>>
> >>>>> Rather than use `is_core` on `struct pmu` we could have say a
> >>>>> `supports_sampling` and we pass to parse_events an option to exclude
> >>>>> any PMU that doesn't have that flag. Now obviously more than just core
> >>>>> PMUs support sampling. All software PMUs, tracepoints, probes. We have
> >>>>> an imprecise list of these in perf_pmu__is_software. So we can set
> >>>>> supports_sampling for perf_pmu__is_software and is_core.
> >>>>
> >>>> Yep, we can do that if the kernel provides the info. But before that
> >>>> I think it's practical to skip uncore PMUs and hope other PMUs don't
> >>>> have event aliases clashing with the legacy names. :)
> >>>>
> >>>>>
> >>>>> I think the problem comes for things like the AMD IBS PMUs, intel_bts
> >>>>> and intel_pt. Often these only support sampling but aren't core. There
> >>>>> may be IBM S390 PMUs or other vendor PMUs that are similar. If we can
> >>>>> make a list of all these PMU names then we can use that to set
> >>>>> supports_sampling and not break event parsing for these PMUs.
> >>>>>
> >>>>> The name list sounds somewhat impractical, let's say we lazily compute
> >>>>> the supports_sampling on a PMU. We need the sampling equivalent of
> >>>>> is_event_supported:
> >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n242
> >>>>> is_event_supported has had bugs, look at the exclude_guest workaround
> >>>>> for Apple PMUs. It also isn't clear to me how we choose the event
> >>>>> config that we're going to probe to determine whether sampling works.
> >>>>> The perf_event_open may reject the test because of a bad config and
> >>>>> not because sampling isn't supported.
> >>>>>
> >>>>> So I think we can make the approach work if we had either:
> >>>>> 1) a list of PMUs that support sampling,
> >>>>> 2) a reliable "is_sampling_supported" test.
> >>>>>
> >>>>> I'm not sure of the advantages of doing (2) rather than just creating
> >>>>> the set of evsels and ignoring those that fail to open. Ignoring
> >>>>> evsels that fail to open seems more unlikely to break anything as the
> >>>>> user is giving the events/config values for the PMUs they care about.
> >>>>
> >>>> Yep, that's also possible. I'm ok if you want to go that direction.
> >>>
> >>> Hmm.. I thought about this again. But it can be a problem if we ignore
> >>> any failures as it can be a real error due to other reason - e.g. not
> >>> supported configuration or other user mistakes.
> >>
> >> Right, we have two not good choices:
> >>
> >> 1) Try to detect whether sampling is supported, but any test doing
> >> this needs to guess at a configuration and we'll need to deflake this
> >> on off platforms like those that don't allow things like exclude
> >> guest.
> >
> > I believe we don't need to try so hard to detect if sampling is
> > supported or not. I hope we will eventually add that to the
> > kernel. Also this is just an additional defense line, it should
> > work without it in most cases. It'll just protect from a few edge
> > cases like uncore PMUs having events of legacy name. For
> > other events or PMUs, I think it's ok to fail.
> >
> >
> >> 2) Ignore failures, possibly hiding user errors.
> >>
> >> I would prefer for (2) the errors were pr_err rather than pr_debug,
> >> something the user can clean up by getting rid of warned about PMUs.
> >> This will avoid hiding the error, but then on Neoverse cycles will
> >> warn about the arm_dsu PMU's cycles event for exactly Linus' test
> >> case. My understanding is that this is deemed a regression, hence
> >> Arnaldo proposing pr_debug to hide it.
> >
> > Right, if we use pr_err() then users will complain. If we use
> > pr_debug() then errors will be hidden silently.
> >
> > Thanks,
> > Namhyung
>
> I'm not sure if anyone would really complain about warnings for
> attempting to open but not succeeding, as long as the event that they
> really wanted is there. I'm imagining output like this:
>
> $ perf record -e cycles -- ls
>
> Warning: skipped arm_dsu/cycles/ event(s), recording on
> armv8_pmuv3_0/cycles/, armv8_pmuv3_1/cycles/
This looks good, but I guess arm_dsu (and maybe others) has multiple
instances like arm_dsu_0, arm_dsu_1, and so on. In that case it should
merge the similar PMUs and print them once. The same goes for armv8_pmuv3.
But I think it's better to skip the events if we know for sure that the
PMU doesn't support sampling.
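To illustrate the merging, here is a rough standalone sketch, not code
from the tree. It assumes instance names only differ by a trailing
"_<digits>" suffix, and both helper names (pmu_base_len and
merge_pmu_names) are made up for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: length of a PMU name without a trailing
 * "_<digits>" instance suffix, so "arm_dsu_0" yields strlen("arm_dsu").
 * Names without such a suffix are returned at full length.
 */
static size_t pmu_base_len(const char *name)
{
	size_t len, end;

	assert(name != NULL);
	len = strlen(name);
	end = len;
	while (end > 0 && isdigit((unsigned char)name[end - 1]))
		end--;
	/* Strip only if digits were found and they follow an underscore. */
	if (end < len && end > 1 && name[end - 1] == '_')
		return end - 1;
	return len;
}

/*
 * Hypothetical helper: write a comma separated list of the unique base
 * names into buf. Returns the number of unique base names, or -1 if
 * buf is too small.
 */
static int merge_pmu_names(const char *const *names, size_t nr,
			   char *buf, size_t size)
{
	size_t used = 0;
	int uniq = 0;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < nr; i++) {
		size_t len = pmu_base_len(names[i]);
		int seen = 0;
		int n;

		/* Skip names whose base was already emitted earlier. */
		for (size_t j = 0; j < i && !seen; j++)
			seen = pmu_base_len(names[j]) == len &&
			       !strncmp(names[j], names[i], len);
		if (seen)
			continue;

		n = snprintf(buf + used, size - used, "%s%.*s",
			     uniq ? ", " : "", (int)len, names[i]);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
		uniq++;
	}
	return uniq;
}
```

With something like this the warning would name arm_dsu once, however
many instances exist. The quadratic duplicate check should be fine for
the small number of PMUs on a system.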
>
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.008 MB perf.data (30 samples) ]
>
> You only really need to worry when no events can be opened, but
> presumably that was a warning anyway.
Right, this is a problem, but I'm not sure the code handles that case
specifically, as it just reports a warning on any failure first.
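What I have in mind is to decide only after all the events have been
tried. A minimal sketch, assuming we count opened and failed events
while opening (the enum and the function name are hypothetical, not
existing perf code):

```c
/* Hypothetical outcome of trying to open every event in the list. */
enum open_verdict {
	OPEN_OK,	/* everything opened, nothing to report */
	OPEN_WARN,	/* some events skipped, but we can still record */
	OPEN_FAIL,	/* nothing opened, this is a real error */
};

/*
 * Hypothetical policy helper: classify the result from the number of
 * events that opened and the number that failed to open.
 */
static enum open_verdict classify_open(int nr_opened, int nr_failed)
{
	if (nr_opened == 0)
		return OPEN_FAIL;
	if (nr_failed > 0)
		return OPEN_WARN;
	return OPEN_OK;
}
```

This way a partial failure gives one warning at the end, and only the
"no events opened" case is treated as an error.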
Thanks,
Namhyung
>
> And in stat mode I wouldn't expect any warnings.