[PATCH v4 00/11] Add per-core RAPL energy counter support for AMD CPUs

From: Dhananjay Ugwekar
Date: Thu Jul 11 2024 - 06:27:04 EST


Currently the energy-cores event in the power PMU aggregates energy
consumption data at a package level. On the other hand the core energy
RAPL counter in AMD CPUs has a core scope (which means the energy
consumption is recorded separately for each core). Earlier efforts to add
the core event in the power PMU had failed [1], due to the difference in
the scope of these two events. Hence, there is a need for a new core scope
PMU.

This patchset adds a new "power_per_core" PMU alongside the existing
"power" PMU, which will be responsible for collecting the new
"energy-per-core" event.

Tested the package level and core level PMU counters with workloads
pinned to different CPUs.

Results with workload pinned to CPU 1 in Core 1 on an AMD Zen4 Genoa
machine:

$ perf stat -a --per-core -e power_per_core/energy-per-core/ -- sleep 1

Performance counter stats for 'system wide':

S0-D0-C0 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C1 1 5.72 Joules power_per_core/energy-per-core/
S0-D0-C2 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C3 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C4 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C5 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C6 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C7 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C8 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C9 1 0.02 Joules power_per_core/energy-per-core/
S0-D0-C10 1 0.02 Joules power_per_core/energy-per-core/

[1]: https://lore.kernel.org/lkml/3e766f0e-37d4-0f82-3868-31b14228868d@xxxxxxxxxxxxxxx/

This patchset applies cleanly on top of v6.10-rc7 as well as latest
tip/master.

v4 changes:
* Add patch 11 which removes the unused function cpu_to_rapl_pmu()
* Add Rui's rb tag for patch 1
* Invert the pmu scope check logic in patch 2 (Peter)
* Add comments explaining the scope check in patch 2 (Peter)
* Use cpumask_var_t instead of cpumask_t in patch 5 (Peter)
* Move renaming code to patch 8 (Rui)
* Reorder the cleanup order of per-core and per-pkg PMU in patch 10 (Rui)
* Add rapl_core_hw_unit variable to store the per-core PMU unit in patch
10 (Rui)

PS: Scope check logic is still kept the same (i.e., all Intel systems being
considered as die scope), Rui will be modifying it to limit the die-scope
only to Cascadelake-AP in a future patch on top of this patchset.

v3 changes:
* Patch 1 added to introduce the logical_core_id which is unique across
the system (Prateek)
* Use the unique topology_logical_core_id() instead of
topology_core_id() (which is only unique within a package on tested
AMD and Intel systems) in Patch 10

v2 changes:
* Patches 6,7,8 added to split some changes out of the last patch
* Use container_of to get the rapl_pmus from event variable (Rui)
* Set PERF_EV_CAP_READ_ACTIVE_PKG flag only for pkg scope PMU (Rui)
* Use event id 0x1 for energy-per-core event (Rui)
* Use PERF_RAPL_PER_CORE bit instead of adding a new flag to check for
per-core counter hw support (Rui)

Dhananjay Ugwekar (10):
perf/x86/rapl: Fix the energy-pkg event for AMD CPUs
perf/x86/rapl: Rename rapl_pmu variables
perf/x86/rapl: Make rapl_model struct global
perf/x86/rapl: Move cpumask variable to rapl_pmus struct
perf/x86/rapl: Add wrapper for online/offline functions
perf/x86/rapl: Add an argument to the cleanup and init functions
perf/x86/rapl: Modify the generic variable names to *_pkg*
perf/x86/rapl: Remove the global variable rapl_msrs
perf/x86/rapl: Add per-core energy counter support for AMD CPUs
perf/x86/rapl: Remove the unused function cpu_to_rapl_pmu

K Prateek Nayak (1):
x86/topology: Introduce topology_logical_core_id()

Documentation/arch/x86/topology.rst | 4 +
arch/x86/events/rapl.c | 454 ++++++++++++++++++--------
arch/x86/include/asm/processor.h | 1 +
arch/x86/include/asm/topology.h | 1 +
arch/x86/kernel/cpu/debugfs.c | 1 +
arch/x86/kernel/cpu/topology_common.c | 1 +
6 files changed, 328 insertions(+), 134 deletions(-)

--
2.34.1