Re: [PATCH] perf stat: Merge CPU maps when merging uncore aliases
From: Chun-Tse Shao
Date: Fri May 01 2026 - 14:19:33 EST
My fault, what I mean was "tweak should_skip_zero_counter"
On Fri, May 1, 2026 at 11:17 AM Chun-Tse Shao <ctshao@xxxxxxxxxx> wrote:
>
> Instead, I submitted another fix to tweak `Merge CPU maps when merging
> uncore aliases`. Please take a look
> lore.kernel.org/20260501181129.938256-1-ctshao@xxxxxxxxxx
>
> Thanks,
> CT
>
>
> On Thu, Apr 16, 2026 at 2:05 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> >
> > On Thu, Apr 16, 2026 at 1:48 PM Chun-Tse Shao <ctshao@xxxxxxxxxx> wrote:
> > >
> > > Sockets with event zero counts are hidden in `perf stat --per-socket`
> > > because alias merging aggregates counts but not CPU maps. This causes
> > > the display logic to incorrectly skip sockets not present in the leader
> > > instance's map.
> > >
> > > For example, with AMD_UMC:
> > > $ cat /sys/bus/event_source/devices/amd_umc*/cpumask
> > > 0
> > > ...
> > > 128
> > > ...
> > >
> > > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > > S0 12 0 amd_umc/config=0xff/
> > >
> > > 1.000777829 seconds time elapsed
> > >
> > > Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated
> > > event correctly reflects all underlying PMU instances.
> > >
> > > After fix:
> > > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > > S0 12 0 amd_umc/config=0xff/
> > > S1 12 0 amd_umc/config=0xff/
> > >
> > > 1.000666681 seconds time elapsed
> > >
> > > Signed-off-by: Chun-Tse Shao <ctshao@xxxxxxxxxx>
> > > ---
> > > tools/perf/util/stat.c | 6 ++++++
> > > 1 file changed, 6 insertions(+)
> > >
> > > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> > > index 14d169e22e8f..d696d217f98d 100644
> > > --- a/tools/perf/util/stat.c
> > > +++ b/tools/perf/util/stat.c
> > > @@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias)
> > > aggr_counts_a->run += aggr_counts_b->run;
> > > }
> > >
> > > + /*
> > > + * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter)
> > > + * knows this merged event covers all CPUs from both aliases.
> > > + */
> > > + perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus);
> > > +
> >
> > Once we have counts, mutating the cpumaps seems wrong; it could impact
> > other things such as the evlist copies of unioned together cpumaps.
> > I'm confused by the bug, we should have two aggregate ids one for each
> > socket. In should_skip_zero_counter it should only skip the zero count
> > if the counter is not applicable for the counter currently trying to
> > be printed. Does the display code fail to iterate through all the
> > counters?
> >
> > Thanks,
> > Ian
> >
> > > return 0;
> > > }
> > >
> > > --
> > > 2.54.0.rc1.555.g9c883467ad-goog
> > >