Re: [PATCH v1] perf evlist: Force adding default events only to core PMUs

From: James Clark
Date: Thu Jun 06 2024 - 05:43:21 EST




On 06/06/2024 08:09, Namhyung Kim wrote:
> On Wed, Jun 5, 2024 at 4:02 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>>
>> On Wed, Jun 5, 2024 at 1:29 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>>>
>>> On Thu, May 30, 2024 at 3:52 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>>>>
>>>> On Thu, May 30, 2024 at 06:46:08AM -0700, Ian Rogers wrote:
>>>>> On Thu, May 30, 2024 at 5:48 AM James Clark <james.clark@xxxxxxx> wrote:
>>>>>>
>>>>>> On 30/05/2024 06:35, Namhyung Kim wrote:
>>>>>>> It might not be a perfect solution but it could be a simple one.
>>>>>>> Ideally I think it'd be nice if the kernel exports more information
>>>>>>> about the PMUs like sampling and exclude capabilities.
>>>>>>> Thanks,
>>>>>>> Namhyung
>>>>>>
>>>>>> That seems like a much better suggestion. Especially with the ever
>>>>>> expanding retry/fallback mechanism that can never really take into
>>>>>> account every combination of event attributes that can fail.
>>>>>
>>>>> I think this approach can work but we may break PMUs.
>>>>>
>>>>> Rather than use `is_core` on `struct pmu` we could have say a
>>>>> `supports_sampling` and we pass to parse_events an option to exclude
>>>>> any PMU that doesn't have that flag. Now obviously more than just core
>>>>> PMUs support sampling. All software PMUs, tracepoints, probes. We have
>>>>> an imprecise list of these in perf_pmu__is_software. So we can set
>>>>> supports_sampling for perf_pmu__is_software and is_core.
>>>>
>>>> Yep, we can do that if the kernel provides the info. But before that
>>>> I think it's practical to skip uncore PMUs and hope other PMUs don't
>>>> have event aliases clashing with the legacy names. :)
>>>>
>>>>>
>>>>> I think the problem comes for things like the AMD IBS PMUs, intel_bts
>>>>> and intel_pt. Often these only support sampling but aren't core. There
>>>>> may be IBM S390 PMUs or other vendor PMUs that are similar. If we can
>>>>> make a list of all these PMU names then we can use that to set
>>>>> supports_sampling and not break event parsing for these PMUs.
>>>>>
>>>>> The name list sounds somewhat impractical, let's say we lazily compute
>>>>> the supports_sampling on a PMU. We need the sampling equivalent of
>>>>> is_event_supported:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print-events.c?h=perf-tools-next#n242
>>>>> is_event_supported has had bugs, look at the exclude_guest workaround
>>>>> for Apple PMUs. It also isn't clear to me how we choose the event
>>>>> config that we're going to probe to determine whether sampling works.
>>>>> The perf_event_open may reject the test because of a bad config and
>>>>> not because sampling isn't supported.
>>>>>
>>>>> So I think we can make the approach work if we had either:
>>>>> 1) a list of PMUs that support sampling,
>>>>> 2) a reliable "is_sampling_supported" test.
>>>>>
>>>>> I'm not sure of the advantages of doing (2) rather than just creating
>>>>> the set of evsels and ignoring those that fail to open. Ignoring
>>>>> evsels that fail to open seems less likely to break anything as the
>>>>> user is giving the events/config values for the PMUs they care about.
>>>>
>>>> Yep, that's also possible. I'm ok if you want to go that direction.
>>>
>>> Hmm.. I thought about this again. But it can be a problem if we ignore
>>> any failures as it can be a real error due to other reason - e.g. not
>>> supported configuration or other user mistakes.
>>
>> Right, we have two not good choices:
>>
>> 1) Try to detect whether sampling is supported, but any test doing
>> this needs to guess at a configuration and we'll need to deflake this
>> on off platforms like those that don't allow things like exclude
>> guest.
>
> I believe we don't need to try so hard to detect if sampling is
> supported or not. I hope we will eventually add that to the
> kernel. Also this is just an additional defense line, it should
> work without it in most cases. It'll just protect from a few edge
> cases like uncore PMUs having events of legacy name. For
> other events or PMUs, I think it's ok to fail.
>
>
>> 2) Ignore failures, possibly hiding user errors.
>>
>> I would prefer for (2) the errors were pr_err rather than pr_debug,
>> something the user can clean up by getting rid of warned about PMUs.
>> This will avoid hiding the error, but then on Neoverse cycles will
>> warn about the arm_dsu PMU's cycles event for exactly Linus' test
>> case. My understanding is that this is deemed a regression, hence
>> Arnaldo proposing pr_debug to hide it.
>
> Right, if we use pr_err() then users will complain. If we use
> pr_debug() then errors will be hidden silently.
>
> Thanks,
> Namhyung

I'm not sure anyone would really complain about a warning for events
that we tried to open and couldn't, as long as the event that they
really wanted is there. I'm imagining output like this:

$ perf record -e cycles -- ls

Warning: skipped arm_dsu/cycles/ event(s), recording on
armv8_pmuv3_0/cycles/, armv8_pmuv3_1/cycles/

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.008 MB perf.data (30 samples) ]

You only really need to worry when no events can be opened, but
presumably that was a warning anyway.

And in stat mode I wouldn't expect any warnings.
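
To make the idea concrete, here is a rough, self-contained sketch of that
behaviour. It is only an illustration and not a patch against perf: the
struct candidate type, the open_with_skip() helper, the simulated "opens"
flag (standing in for the result of perf_event_open()) and the wording of
the error message are all hypothetical names and assumptions of mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one PMU's copy of an event; not a perf struct. */
struct candidate {
	const char *pmu;	/* e.g. "arm_dsu" */
	const char *event;	/* e.g. "cycles" */
	bool opens;		/* simulated result of trying to open the event */
};

/* Append "pmu/event/" to buf, comma-separating every entry after the first. */
static void append_name(char *buf, size_t len, const struct candidate *c)
{
	size_t used = strlen(buf);

	snprintf(buf + used, len - used, "%s%s/%s/",
		 used ? ", " : "", c->pmu, c->event);
}

/*
 * Try every candidate and return how many opened.
 * - Some skipped, some opened, record mode: one warning line in msg.
 * - Some skipped, some opened, stat mode: msg stays empty.
 * - Nothing opened: msg holds an error (wording is an assumption).
 */
static int open_with_skip(const struct candidate *c, size_t n, bool stat_mode,
			  char *msg, size_t len)
{
	char skipped[256] = "", opened[256] = "";
	int nr_open = 0, nr_skip = 0;
	size_t i;

	assert(msg && len > 0);
	msg[0] = '\0';

	for (i = 0; i < n; i++) {
		if (c[i].opens) {
			append_name(opened, sizeof(opened), &c[i]);
			nr_open++;
		} else {
			append_name(skipped, sizeof(skipped), &c[i]);
			nr_skip++;
		}
	}

	if (nr_open == 0)
		snprintf(msg, len, "Error: no events could be opened");
	else if (nr_skip > 0 && !stat_mode)
		snprintf(msg, len, "Warning: skipped %s event(s), recording on %s",
			 skipped, opened);

	return nr_open;
}
```

Called with arm_dsu/cycles/ failing and the two armv8_pmuv3 events opening,
this produces the warning line from the example above in record mode and
nothing in stat mode. The only hard failure is the case where every
candidate fails to open.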