Re: [PATCH] perf stat: Check aliases in should_skip_zero_counter
From: Chun-Tse Shao
Date: Fri May 01 2026 - 17:42:12 EST
Hi Ian, thanks for your review.
On Fri, May 1, 2026 at 12:12 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> On Fri, May 1, 2026 at 11:11 AM Chun-Tse Shao <ctshao@xxxxxxxxxx> wrote:
> >
> > Aggregation IDs (e.g. sockets, caches) with event zero counts are hidden
> > in aggregated modes (like `--per-socket`) because alias merging
> > aggregates counts but not CPU maps. This causes display logic to skip
> > aggregation IDs not present in the leader's map.
> >
> > Instead of mutating CPU maps in `evsel__merge_aggr_counters` (which
> > could have side effects on shared CPU maps), check aliases in
> > `should_skip_zero_counter` to ensure all applicable aggregation IDs are
> > displayed.
> >
> > Before the fix:
> > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > S0 12 0 amd_umc/config=0xff/
> >
> > 1.000721604 seconds time elapsed
> >
> > After the fix:
> > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > S0 12 0 amd_umc/config=0xff/
> > S1 12 0 amd_umc/config=0xff/
> >
> > 1.000879376 seconds time elapsed
>
> Does this change the behavior without --per-socket when the uncore
> PMUs have different cpumasks?
Nope, the behavior is unchanged for `-a`, since counts aggregate to
the leader's `aggr[0]` and all CPUs's `aggr_id` is 0 in AGGR_GLOBAL
mode.
>
> > `perf test` looks good too.
> >
> > Assisted-by: Gemini:gemini-3.1-pro-preview
> > Signed-off-by: Chun-Tse Shao <ctshao@xxxxxxxxxx>
> > ---
> > tools/perf/util/stat-display.c | 21 +++++++++++++++++++++
> > 1 file changed, 21 insertions(+)
> >
> > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > index 993f4c4b8f44..c12a3405118a 100644
> > --- a/tools/perf/util/stat-display.c
> > +++ b/tools/perf/util/stat-display.c
> > @@ -945,6 +945,27 @@ static bool should_skip_zero_counter(struct perf_stat_config *config,
> > if (aggr_cpu_id__equal(id, &own_id))
> > return false;
> > }
> > +
> > + /*
> > + * If the counter is a leader in alias merging, check if the aggr id
> > + * matches any of its aliases.
>
> Could you expand why in this case we want the zero count?
We don't want to hide counts if it is not in leader's cpumask even
though it is 0. It would cause confusion to users/parsers.
In my scenario, I tested an event that was all zeros on S1 and was
hidden by perf. Initially, I thought perf wasn't collecting the event
on S1, then realized collecting was working but hidden.
>
> > + */
> > + if (config->aggr_mode != AGGR_NONE && counter->first_wildcard_match == NULL) {
> > + struct evsel *alias;
> > +
> > + evlist__for_each_entry(counter->evlist, alias) {
> > + if (alias->first_wildcard_match == counter) {
>
> Could this also have a "&& !perf_cpu_map__equal(counter->core.cpu,
> alias->core.cpus)" ? ie if the alias' cpumask matches that of the
> counter, then the loop above covers it. Perhaps the cpumaps of each
> alias could be merged and the test only run if their intersection is
> not empty.
Thanks for the suggestion, I will add it in the next patch.
>
> > + perf_cpu_map__for_each_cpu(cpu, idx, alias->core.cpus) {
> > + struct aggr_cpu_id own_id =
> > + config->aggr_get_id(config, cpu);
>
> Could this fit on one line?
Then it will make the line 101 characters long...
>
> Thanks,
> Ian
>
> > +
> > + if (aggr_cpu_id__equal(id, &own_id))
> > + return false;
> > + }
> > + }
> > + }
> > + }
> > +
> > return true;
> > }
> >
> > --
> > 2.54.0.545.g6539524ca2-goog
> >