Re: [PATCH v4 2/4] perf stat: Use counter cpumask to skip zero values

From: Ian Rogers
Date: Thu Jan 09 2025 - 14:26:04 EST


On Wed, Jan 8, 2025 at 11:45 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> On Tue, Jan 07, 2025 at 09:34:26PM -0800, Ian Rogers wrote:
> > When a counter is 0 it may or may not be skipped. For uncore counters
> > it is common they are only valid on 1 logical CPU and all other CPUs
> > should be skipped. The PMU's cpumask was used for the skip
> > calculation, but that cpumask may not reflect user overrides.
>
> It's not clear to me how uncore PMU works with CPU overrides.
> I thought it's ignored and the kernel changed the CPU internally
> using the cpumask. But it may be transparent to userspace and
> we can think it works as what we expect.
>
> Anyway, the commit dd15480a3d67b9cf ("perf stat: Hide invalid uncore
> event output for aggr mode") added the code and the concern was like
>
> $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1
>
> So it should be fine as long as the output remains the same.

Confirmed the output remains the same:
```
$ perf stat -a --per-core -e energy-pkg sleep 1

Performance counter stats for 'system wide':

S0-D0-C0 1 22.94 Joules energy-pkg

1.000934566 seconds time elapsed
```

> > Similarly a counter on a core PMU may explicitly not
> > request a CPU be gathered. If the counter on this CPU's value is 0
> > then the counter should be skipped as it wasn't requested. Switch from
> > using the PMU cpumask to that associated with the evsel to support
> > these cases.
>
> Do you mean hybrid PMUs? I guess they won't open events on not
> supported/requested CPUs in the first place, right?

Right. On a PMU, "uncore" is not simply the opposite of "core"; the
kernel PMU drivers make this a bit of a muddle. The previous code
always showed a 0 when `!pmu->is_uncore`, and is_uncore is set when a
PMU has a `/sys/devices/<pmu name>/cpumask` file. Core PMUs should
either have no cpumask file or have a cpus file instead. In general
the evsel cpumask should match the PMU cpumask. The change here is
that we use the evsel cpumask whether or not the PMU has the
`/sys/devices/<pmu name>/cpumask` file. Not having the file may mean
hybrid, a single core PMU, a PMU driver bug, other core PMUs such as
AMD IBS and ARM SPE, etc. The possible output change is that a 0 on a
`!pmu->is_uncore` PMU that was previously shown is now hidden. For
that to happen the aggregation would need to skip that CPU and, as you
say, that shouldn't happen.
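
To make the before/after concrete, here is a rough sketch of the two
skip decisions as described in this thread. It is not the code from the
patch: the struct and function names (`cpu_mask`, `pmu_sketch`,
`evsel_sketch`, `skip_zero_old`, `skip_zero_new`) are made up for
illustration, the mask is a plain 64-bit bitmap, and it models only the
zero-value/cpumask part of the check, none of the other conditions the
real code may apply.
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration, not the perf tool's real types. */
#define MAX_CPUS 64

struct cpu_mask {
	uint64_t bits;		/* bit N set => CPU N is in the mask */
};

struct pmu_sketch {
	bool is_uncore;		/* PMU exposes a sysfs cpumask file */
	struct cpu_mask cpus;	/* CPUs listed in that file */
};

struct evsel_sketch {
	const struct pmu_sketch *pmu;
	struct cpu_mask cpus;	/* CPUs the event covers, after user overrides */
};

static bool cpu_mask_has(const struct cpu_mask *mask, int cpu)
{
	assert(cpu >= 0 && cpu < MAX_CPUS);
	return (mask->bits >> cpu) & 1;
}

/* Before: only uncore PMUs can have a 0 skipped, decided by the PMU cpumask. */
static bool skip_zero_old(const struct evsel_sketch *evsel, int cpu, uint64_t val)
{
	if (val != 0)
		return false;	/* non-zero values are never skipped */
	if (!evsel->pmu->is_uncore)
		return false;	/* a 0 on a non-uncore PMU is always shown */
	return !cpu_mask_has(&evsel->pmu->cpus, cpu);
}

/* After: any 0 is skipped when the CPU is not in the evsel's own cpumask. */
static bool skip_zero_new(const struct evsel_sketch *evsel, int cpu, uint64_t val)
{
	if (val != 0)
		return false;
	return !cpu_mask_has(&evsel->cpus, cpu);
}
```

With an evsel on a non-uncore PMU whose mask holds only CPU 0, a 0 read
for CPU 1 is shown by `skip_zero_old()` and hidden by `skip_zero_new()`,
which is the one output difference mentioned above.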

Thanks,
Ian