Re: [PATCH v2 2/2] perf evsel: Don't configure framepointer callchains on s390

From: Thomas Richter

Date: Fri Mar 13 2026 - 05:47:23 EST


On 3/12/26 17:46, Ian Rogers wrote:
> On Thu, Mar 12, 2026 at 8:54 AM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>>
>> On Thu, Mar 12, 2026 at 5:45 AM Thomas Richter <tmricht@xxxxxxxxxxxxx> wrote:
>>>
>>> On 3/12/26 07:16, Ian Rogers wrote:
>>>> Frame pointer callchains are not supported on s390. Ignore the option
>>>> and print a warning.
>>>>
>>>> Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
>>>> ---
>>>> v2: Only disable user callchains as AI is telling me native "kernel"
>>>> callchains are supported on s390.
>>>> ---
>>>> tools/perf/util/evsel.c | 6 ++++++
>>>> 1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>>>> index bd14d9bbc91f..fa21b48cba86 100644
>>>> --- a/tools/perf/util/evsel.c
>>>> +++ b/tools/perf/util/evsel.c
>>>> @@ -1076,6 +1076,12 @@ static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *o
>>>> attr->exclude_callchain_user = 1;
>>>> }
>>>>
>>>> + if (EM_HOST == EM_S390 && (evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
>>>> + !evsel->core.attr.exclude_callchain_user) {
>>>> + pr_warning("Excluding user callchains that are not supported on s390. Try '--call-graph dwarf'\n");
>>>> + evsel->core.attr.exclude_callchain_user = 1;
>>>> + }
>>>> +
>>>> if (param->defer && !attr->exclude_callchain_user)
>>>> attr->defer_callchain = 1;
>>>> }
>>>
>>> Ian, thanks very much.
>>> Your patch set helps a lot. However there is a small nit (which is mandatory). Please add these lines
>>>
>>> evsel->core.attr.sample_type &= ~PERF_SAMPLE_CALLCHAIN;
>>> evsel->core.attr.sample_type &= ~PERF_SAMPLE_REGS_USER;
>>> evsel->core.attr.sample_type &= ~PERF_SAMPLE_STACK_USER;
>>
>> So these lines are dropping callchain from the sample_type which means
>> the kernel stack won't be sampled. AI was telling me this worked, but
>> I'm guessing it was wrong. I think rather than this it is just cleaner
>> never to set the bits in the perf_event_attr, more like what v1 of the
>> patch did:
>> https://lore.kernel.org/lkml/20260312031928.1494864-3-irogers@xxxxxxxxxx/
>>

Let me try to answer your and Namhyung's questions in one reply.
s390 has many PMUs. There is one for counting (cpum_cf) and one for sampling (cpum_sf),
and they are totally different hardware components.
Hardware sampling
1. writes into a large buffer; each entry holds a timestamp and the IP (instruction pointer), no registers, no PID.
2. raises an interrupt when the buffer gets full; the interrupt handler then scans the
hardware samples, converts the valid ones to perf format and saves them in the perf ring buffer.
This means we cannot reconstruct register values for an individual sample. The current PID is saved by
other means and added to the perf data when the perf ring buffer is written.

On s390, when we need callchains, we always use the cpu-clock event, never the hardware cycles event.
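
To make the consequence concrete: since the hardware samples carry neither registers nor stack data, an event that asks for callchains, registers or the user stack in its sample_type cannot be served by the sampling hardware. Below is a minimal, self-contained sketch of such a check. The bit values are the ones from include/uapi/linux/perf_event.h as far as I know; the SKETCH_ names and the helper are made up for illustration and are not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Local copies of some perf_event_attr.sample_type bits. The SKETCH_
 * prefix marks them as illustration only; real code would include
 * <linux/perf_event.h> and use the PERF_SAMPLE_* names.
 */
#define SKETCH_SAMPLE_IP         (1ULL << 0)
#define SKETCH_SAMPLE_TIME       (1ULL << 2)
#define SKETCH_SAMPLE_CALLCHAIN  (1ULL << 5)
#define SKETCH_SAMPLE_REGS_USER  (1ULL << 12)
#define SKETCH_SAMPLE_STACK_USER (1ULL << 13)
#define SKETCH_SAMPLE_REGS_INTR  (1ULL << 18)

/*
 * Hypothetical helper: returns true when sample_type requests data that
 * a sample consisting only of timestamp and IP cannot provide.
 */
static bool sketch_needs_regs_or_stack(uint64_t sample_type)
{
	const uint64_t unsupported = SKETCH_SAMPLE_CALLCHAIN |
				     SKETCH_SAMPLE_REGS_USER |
				     SKETCH_SAMPLE_REGS_INTR |
				     SKETCH_SAMPLE_STACK_USER;

	return (sample_type & unsupported) != 0;
}
```

With this helper, a plain IP plus timestamp request passes, while adding any callchain, register or stack bit makes it report the request as unsupported.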


>>> to the new if(EM_HOST == ...) above.
>>> The s390 CPU Measurement sampling device driver does not check on the attr.core.exclude_callchain_user
>>> member, but on the sample_type bit mask. It returns -EOPNOTSUPP when this bit PERF_SAMPLE_CALLCHAIN
>>> is set. This solves the invocation with command line flag -g as in
>>> # ./perf record -v -e cycles -g -- perf test -w noploop
>>> ...
>>> perf record: Captured and wrote 0.183 MB perf.data ]
>>
>> Right because the callchain was removed from all the samples. We can't
>> fix old kernels (other than by using fix tags); is there a possibility
>> of adding the exclude_callchain_user to the s390 perf driver for the
>> sake of kernel callchains? It seems better than providing no
>> callchain.
>>
>>> Also I discovered that the fallback when using --call-graph dwarf command line flag still fails:
>>> # ./perf record -v -e cycles --call-graph dwarf -- perf test -w noploop
>>> ...
>>> Warning:
>>> Trying to fall back to excluding guest samples
>>> Error:
>>> Failure to open event 'cycles:H' on PMU 'cpum_cf' which will be removed.
>>> cycles:H: PMU Hardware doesn't support sampling overflow-interrupts. Try 'perf stat'
>>> Error:
>>> Failure to open any events for recording.
>>>
>>> The reason is in __evsel__config_callchain() which calls evsel__set_sample_bit(evsel, CALLCHAIN)
>>> and sets the PERF_SAMPLE_CALLCHAIN bit in evsel->core.attr.sample_type. It also sets the
>>> member attr->exclude_callchain_user = 1 and sets bits REGS_USER and _STACK_USER.
>>> All three bits are not supported by s390.
>>
>> I'm confused by this and your previous testing that showed the
>> `--call-graph dwarf` worked. You need the sampled registers for dwarf
>> unwinding to provide initial register values for the unwinder.
>>

Right, as explained above, we have to use '-e cpu-clock --call-graph dwarf'
to get valid callchain data.

>>> I have modified your 2nd patch and appended it.
>>>
>>> I find all these bits in sample_type and the attr.exclude_XXX stuff very confusing. If there
>>> is a more consistent way of checking these features, please let me know.
>
> I forgot to mention, yeah the exclude thing is maddening. It takes
> about 100 lines to convert the command line modifiers to those in the
> perf_event_attr, there's a priority to them, and so on:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/parse-events.c#n1759
> Fwiw, I was thinking for patch 1 of holding onto the parsed modifiers
> so that we could reset the excludes based on them when switching to
> the software event.
>
>> Ok, let me check it out.
>
> So looking at the cpum_cf driver it fails events for having sampling enabled:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/s390/kernel/perf_cpum_cf.c#n859
> ```
> if (is_sampling_event(event)) /* No sampling support */
> ```
> and the cpum_sf driver fails for any kind of callchain:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/s390/kernel/perf_cpum_sf.c#n839
> ```
> static bool is_callchain_event(struct perf_event *event)
> {
> u64 sample_type = event->attr.sample_type;
>
> return sample_type & (PERF_SAMPLE_CALLCHAIN | PERF_SAMPLE_REGS_USER |
> PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_STACK_USER);
> }
>
> static int cpumsf_pmu_event_init(struct perf_event *event)
> {
> int err;
>
> /* No support for taken branch sampling */
> /* No support for callchain, stacks and registers */
> if (has_branch_stack(event) || is_callchain_event(event))
> return -EOPNOTSUPP;
> ```
> Perhaps there is an oversight in the cpum_cf driver wrt branch stacks
> (LBR on x86). The PERF_SAMPLE_CALLCHAIN bit is set for perf call-graph
> options currently:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/evsel.c#n1024
>
> I think cpum_sf is the PMU we care about for the default cycles event?

Right, this PMU does the sampling on s390.

> Since callchains of any flavor don't work with cpum_sf we can do a few
> things:
> 1) Disable the callchain and allow hardware sampling to continue,
> 2) Switch to a software event like cpu-clock,
> 3) Fail for the callchain option with this PMU, which is currently
> happening anyway.

@Namhyung:
It fails to open the cycles event on PMU cpum_sf.

>
> I dislike option 3 because it requires special s390 logic for many
> tests, and we lose testing coverage of hardware events. Option 1 is a
> smaller patch, an early return in __evsel__config_callchain if on
> s390. Option 2 feels most like what the user would want given they
> asked for a callchain. We could change evlist__new_default to take a
> "with callchain" boolean and on s390 that could switch the event to a
> software one. Only two non-test callers use evlist__new__default
> (record and top), so it isn't a huge change.
>
> Wdyt Thomas, option 1 or 2?
>
> Thanks,
> Ian

I actually prefer option 2: when callchains are requested by the user,
switch to the cpu-clock event.
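
A rough sketch of what the selection for option 2 could look like, assuming the code behind evlist__new_default learns whether a callchain was requested. The function name and both parameters are hypothetical and do not exist in perf; the sketch only shows the decision, not how the evlist is built.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not an existing perf function: pick the name of
 * the default event. On s390 the hardware sampling PMU cannot deliver
 * callchains, so fall back to the software event when one is requested.
 */
static const char *sketch_default_event(bool host_is_s390, bool with_callchain)
{
	if (host_is_s390 && with_callchain)
		return "cpu-clock";

	return "cycles";
}
```

Without a callchain request s390 would still get the hardware cycles event, so hardware sampling keeps its coverage for that case.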

Thanks a lot for your patience.

--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH

Chairman of the Supervisory Board: Wolfgang Wendt

Management: David Faller

Registered office: Böblingen / Registration court: Amtsgericht Stuttgart, HRB 243294