Re: [PATCH v3 0/2] perf stat: Fix uncore metric scaling across aggregation modes

From: Arnaldo Carvalho de Melo

Date: Thu Jun 04 2026 - 10:08:21 EST


On Thu, May 28, 2026 at 12:17:57PM -0700, Namhyung Kim wrote:
> On Thu, May 21, 2026 at 01:15:03PM -0700, Chun-Tse Shao wrote:
> > This series fixes a scaling issue for metrics (like lpm_miss_lat) across
> > different runtime aggregation modes.
> >
> > Uncore metrics currently use `source_count` to scale events. However,
> > `source_count` returns the total uncore unit count regardless of the
> > selected aggregation mode. When metrics are evaluated in an
> > aggregation mode other than `--per-socket`, the aggregated uncore
> > events are divided by the total uncore count rather than by the number
> > of uncores belonging to the aggregation group, producing wrong metric
> > results.
> >
> > To fix this, we:
> > 1. Introduce the aggr_nr() keyword to the metric parser, which
> > dynamically resolves to the number of active units in the current
> > aggregation group (`gr->nr`).
> >
> > 2. Update the python metrics to use `aggr_nr` instead of `source_count`,
> > ensuring correct scaling across all runtime aggregation boundaries.
> >
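
To illustrate the scaling problem described above, here is a minimal
sketch in Python. It is not the perf implementation: `AggrGroup`,
`scale_by_total`, `scale_by_group` and the sample readings are all
hypothetical names and numbers, chosen only to show how dividing a
group's aggregated value by the system-wide unit count differs from
dividing by the units in that group.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AggrGroup:
    """Hypothetical stand-in for one aggregation group (e.g. one socket)."""
    name: str
    unit_values: List[float]  # one made-up reading per uncore unit in the group


def scale_by_total(group: AggrGroup, total_units: int) -> float:
    # Divides the group's aggregated value by the system-wide unit count,
    # which is the kind of mismatch the cover letter describes.
    return sum(group.unit_values) / total_units


def scale_by_group(group: AggrGroup) -> float:
    # Divides by the number of units that belong to this group only.
    nr = len(group.unit_values)
    return sum(group.unit_values) / nr if nr else 0.0


groups = [AggrGroup("S0", [100.0, 120.0]), AggrGroup("S1", [90.0, 110.0])]
total_units = sum(len(g.unit_values) for g in groups)  # 4 units in total

for g in groups:
    print(g.name, scale_by_total(g, total_units), scale_by_group(g))
```

With two groups of two units each, the system-wide divisor yields half
of the per-group average, so the size of the error depends on how many
units fall outside the group being scaled.
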
> > Before the fix (incorrectly low latency in global mode):
> > $ perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
> > {"ns lpm_miss_lat_rem" : "122.8", "ns lpm_miss_lat_loc" : "114.5"}
> > $ perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
> > {"socket" : "S0", "ns lpm_miss_lat_rem" : "232.1", "ns lpm_miss_lat_loc" : "278.2"}
> > {"socket" : "S1", "ns lpm_miss_lat_rem" : "233.9", "ns lpm_miss_lat_loc" : "257.5"}
> >
> > After the fix (correctly scaled latency in all aggregation modes):
> > $ perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
> > {"ns lpm_miss_lat_rem" : "231.7", "ns lpm_miss_lat_loc" : "245.0"}
> > $ perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
> > {"socket" : "S0", "ns lpm_miss_lat_rem" : "238.3", "ns lpm_miss_lat_loc" : "249.4"}
> > {"socket" : "S1", "ns lpm_miss_lat_rem" : "259.1", "ns lpm_miss_lat_loc" : "253.1"}
> >
> > v3:
> > Fixed based on Sashiko review:
> > - Removed the unnecessary, copied `redefined-builtin` pylint-disable
> > comment from `aggr_nr` definition inside `metric.py`.
> >
> > v2: lore.kernel.org/20260521035941.3860145-1-ctshao@xxxxxxxxxx
> > Fixed based on Sashiko review:
> > - Fixed `aggr_nr` setting when an uncore event fails to run
> > (counts.run == 0) to explicitly set it to 0 instead of defaulting to
> > 1.
> > - Accumulated `aggr_nr` when multiple unmerged PMU events are
> > associated with the same metric ID to prevent incorrect scaling
> > across active sockets.
> > - Removed unused `List` import from `typing` in `intel_metrics.py`.
> >
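
The two v2 changes to `aggr_nr` can be pictured roughly as follows.
This is a sketch under stated assumptions, not the code from the
series: `accumulate_aggr_nr` and the `(metric_id, run, nr)` tuple
layout are hypothetical, standing in for however perf tracks the
per-event unit count. It shows an event that never ran contributing 0
units instead of a default of 1, and counts from several events that
share a metric ID being summed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accumulate_aggr_nr(events: Iterable[Tuple[str, int, int]]) -> Dict[str, int]:
    """Sum unit counts per metric ID (hypothetical helper).

    Each event is a (metric_id, run, nr) tuple, where `run` mimics
    counts.run and `nr` is the number of units the event covered.
    """
    totals: Dict[str, int] = defaultdict(int)
    for metric_id, run, nr in events:
        # An event that failed to run (run == 0) contributes 0 units,
        # rather than silently counting as 1.
        totals[metric_id] += nr if run > 0 else 0
    return dict(totals)


# Two unmerged events feed metric "m"; the event for "n" never ran.
sample = [("m", 1, 2), ("m", 1, 2), ("n", 0, 3)]
print(accumulate_aggr_nr(sample))
```

A caller dividing by the result would still have to guard against a
count of 0, which this sketch leaves out.
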
> > v1: lore.kernel.org/20260520180032.3045144-1-ctshao@xxxxxxxxxx
> >
> > Chun-Tse Shao (2):
> > perf stat: Add aggr_nr metric parser support
> > perf stat: Use aggr_nr scaling for Intel uncore miss latency metrics
>
> Acked-by: Namhyung Kim <namhyung@xxxxxxxxxx>

Thanks, applied to perf-tools-next, for v7.2.

- Arnaldo