Re: [PATCH v7] soc: qcom: add l2 cache perf events driver

From: Mark Rutland
Date: Wed Nov 16 2016 - 05:38:17 EST


On Fri, Nov 11, 2016 at 04:52:35PM -0500, Leeder, Neil wrote:
> So is there a use-case for individual uncore PMUs when they can't be
> used in task mode or per-cpu?
>
> The main (only?) use will be in system mode, in which case surely it
> makes sense to provide a single aggregated count?

If you are aware of the system topology, the per-unit numbers may be
more useful than a summed count. If you aren't, it's still possible to
sum them in userspace.
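
As a rough illustration of that last point, here is a minimal sketch of
summing per-unit counts in userspace once each PMU instance has been
read separately. The struct, the function name and the "l2cache_N"
style of instance naming are all assumptions made up for the example;
none of them come from the patch or from perf itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-unit reading: one count per uncore PMU instance.
 * The pmu_name field is illustrative only (e.g. "l2cache_0").
 */
struct unit_count {
    const char *pmu_name;
    uint64_t count;
};

/* Sum the counts that were read separately from each PMU instance. */
static uint64_t sum_unit_counts(const struct unit_count *units, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += units[i].count;

    return total;
}
```

A topology-aware tool can keep the individual entries and report them
per unit; one that doesn't care can just print the total.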

Having them summed by the kernel means that the kernel is implying it
supports group semantics that it cannot provide, since it cannot
start/stop all counters in a group atomically if they're split across
several units.

> With individual PMUs exposed there will be potentially dozens of
> nodes for userspace to collect from which would make perf
> command-line usage unwieldy at best.

FWIW, for uncore/system PMUs, even on x86 there are a number of
independent units.

Some PMUs (those which are symmetric across the topology) get hidden
behind the same struct pmu, but instances are still isolated (and their
values are not summed).

Thanks,
Mark.