Re: [PATCH v2] perf stat: Fix crash on arm64
From: Leo Yan
Date: Fri Mar 27 2026 - 14:06:23 EST
On Thu, Mar 26, 2026 at 02:21:00PM -0700, Ian Rogers wrote:
[...]
> > On one of my board, I can see the log:
> >
> > Events in 'frontend_bound' fully contained within 'retiring'
> > Events in 'bad_speculation' fully contained within 'retiring'
> > Events in 'backend_bound' fully contained within 'retiring'
>
> I looked at nvidia/t410/metrics.json but I'm not clear on where the
> issue is coming from.
Digging a bit, all these metrics hit syntax errors because
tool_pmu__cpu_slots_per_cycle() returns zero for #slots.
See the log: https://termbin.com/e9ue
After that, parse_groups() treats all these problematic metrics
as being contained by "retiring".
I have sent a patch to make __add_metrics() respect the returned syntax
error, so the program can exit early.
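
To illustrate the idea only (this is a stand-alone toy, not the patch and
not perf code; every toy_* name below is made up for the example), stopping
at the first parse failure looks roughly like this:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a metric; not the perf 'struct metric'. */
struct toy_metric {
	const char *name;
	int parse_ret;	/* 0 on success, negative errno-style value on failure */
};

/* Hypothetical: pretend to parse one metric expression. */
static int toy_parse_metric(const struct toy_metric *m)
{
	return m->parse_ret;
}

/*
 * Add metrics one by one and stop at the first parse failure, so that
 * later stages never see a metric whose expression failed to parse.
 * *nr_added reports how many metrics were accepted before stopping.
 */
static int toy_add_metrics(const struct toy_metric *metrics, size_t nr,
			   size_t *nr_added)
{
	size_t i;

	*nr_added = 0;
	for (i = 0; i < nr; i++) {
		int ret = toy_parse_metric(&metrics[i]);

		if (ret)
			return ret;	/* propagate the error instead of ignoring it */
		(*nr_added)++;
	}
	return 0;
}
```

With a list where the second entry fails with -EINVAL, toy_add_metrics()
returns -EINVAL and reports one accepted metric, so the caller can bail out
before any "contained" matching happens.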
> > In parse_groups(), when find a event is fully contained by a previous
> > event, it skips to call parse_ids(), as a result, m->pctx->ids is not
> > initialized. Then, setup_metric_events() returns an empty metric
> > events, pick_display_evsel() consumes the returned metric_events and
> > feeds to metricgroup__lookup() with passing "evsel = NULL".
>
> Fully contained groups exist on x86, why isn't this problem breaking
> whenever this happens? Stepping through "contained" examples, I see
> that the ids aren't initialized. I think something else must be
> happening.
If you change tool_pmu__cpu_slots_per_cycle() to always return zero,
this might reproduce the issue.
[...]
> > @@ -1463,19 +1463,18 @@ static int parse_groups(struct evlist *perf_evlist,
> > if (expr__subset_of_ids(n->pctx, m->pctx)) {
> > pr_debug("Events in '%s' fully contained within '%s'\n",
> > m->metric_name, n->metric_name);
> > - metric_evlist = n->evlist;
> > + contained = n->evlist;
> > break;
> > }
> > -
> > }
> > }
> > if (!metric_evlist) {
> > + metric_evlist = contained ? contained : m->evlist;
> > +
> > ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
> > - m->group_events, tool_events, &m->evlist);
> > + m->group_events, tool_events, &metric_evlist);
>
> Won't this match the behavior of metric_no_merge/--metric-no-merge,
> since for every metric the events for that metric are being appended
> to the evlist?
TBH, I still do not fully understand parse_groups().
The change above passes &metric_evlist to parse_ids(). This only changes
the value of metric_evlist itself; it does not update m->evlist or
n->evlist. Is this not right for the metric_no_merge case?
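
The pointer semantics I mean can be shown with a stand-alone toy (not perf
code; the toy_* types and functions are invented for the example and only
mimic the shape of the out-parameter):

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the perf 'struct evlist' or 'struct metric'. */
struct toy_evlist { int nr_events; };
struct toy_owner { struct toy_evlist *evlist; };

/* Hypothetical parser: hands back a result list through an out-parameter. */
static int toy_parse_ids(struct toy_evlist *result, struct toy_evlist **out)
{
	*out = result;	/* only the pointer that 'out' refers to is written */
	return 0;
}

/* Parse via a local pointer: m->evlist is left untouched. */
static int toy_parse_via_local(struct toy_owner *m, struct toy_evlist *result)
{
	struct toy_evlist *metric_evlist = m->evlist;	/* copy of the pointer value */

	return toy_parse_ids(result, &metric_evlist);
}

/* Parse via the member itself: m->evlist now points at the result. */
static int toy_parse_via_member(struct toy_owner *m, struct toy_evlist *result)
{
	return toy_parse_ids(result, &m->evlist);
}
```

After toy_parse_via_local() the owner's evlist pointer is unchanged, while
toy_parse_via_member() makes it point at the parsed list. Whether the real
parse_groups() relies on m->evlist being written is exactly what I am
unsure about.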
Here I am not sure how to use parse_ids() to parse "m->pctx" while
avoiding overwriting "n->evlist" in the fully contained case.
Thanks,
Leo