Re: [PATCH 0/5] intel_rapl & perf rapl: combine PMU support

From: Zhang, Rui
Date: Thu Feb 01 2024 - 00:35:22 EST


On Wed, 2024-01-31 at 15:40 +0100, Rafael J. Wysocki wrote:
> On Wed, Jan 31, 2024 at 3:24 PM Zhang Rui <rui.zhang@xxxxxxxxx>
> wrote:
> >
> > This patch series is made based on the patch series posted at
> > https://lore.kernel.org/all/20240131113713.74779-1-rui.zhang@xxxxxxxxx/
> >
> > Problem statement
> > -----------------
> > MSR RAPL powercap sysfs is done in
> > drivers/powercap/intel_rapl_msr.c.
> > MSR RAPL PMU is done in arch/x86/events/rapl.c.
> >
> > They maintain two separate CPU model lists, describing the same
> > feature available on the same set of hardware. This adds a lot of
> > unnecessary maintenance burden.
> >
> > Now we need to introduce TPMI RAPL PMU support, which again shares
> > most
> > of the logic with MSR RAPL PMU.
> >
> > Solution
> > --------
> > Introduce PMU support as part of the RAPL framework and remove the
> > current MSR RAPL PMU code.
> >
> > The idea is that, if a RAPL Package device is registered to RAPL
> > framework, and is ready for energy reporting and control via
> > powercap
> > sysfs, then it is also ready for PMU.
> >
> > So introduce PMU support in the RAPL framework that works for all
> > registered RAPL Package devices. With this, we can remove the
> > current MSR RAPL PMU completely.
> >
> > Given that the MSR RAPL and TPMI RAPL drivers won't function on the
> > same platform, the new RAPL PMU can be fully compatible with the
> > current MSR RAPL PMU, including using the same PMU name and event
> > name/id/unit/scale.
> >
> > For example, on platforms that use either MSR or TPMI, the same
> > command
> >  perf stat -e power/energy-pkg/ -e power/energy-ram/ -e power/energy-cores/ FOO
> > reports the energy consumption, provided the events appear in the
> > "perf list" output.
> >
> > Notes
> > -----
> > There are indeed some functional changes introduced, due to the
> > divergence between the two CPU model lists. These include:
> > 1. Fix BROADWELL_D in intel_rapl driver to use the fixed DRAM domain
> >    energy unit.
> > 2. Enable PMU for some Intel platforms, which were missing in
> >    arch/x86/events/rapl.c. This includes
> >         ICELAKE_NNPI
> >         ROCKETLAKE
> >         LUNARLAKE_M
> >         LAKEFIELD
> >         ATOM_SILVERMONT
> >         ATOM_SILVERMONT_MID
> >         ATOM_AIRMONT
> >         ATOM_AIRMONT_MID
> >         ATOM_TREMONT
> >         ATOM_TREMONT_D
> >         ATOM_TREMONT_L
> > 3. Change the logic for enumerating AMD/HYGON platforms
> >    Previously, it was
> >         X86_MATCH_FEATURE(X86_FEATURE_RAPL, &model_amd_hygon)
> >    And now it is
> >         X86_MATCH_VENDOR_FAM(AMD, 0x17, &rapl_defaults_amd)
> >         X86_MATCH_VENDOR_FAM(AMD, 0x19, &rapl_defaults_amd)
> >         X86_MATCH_VENDOR_FAM(HYGON, 0x18, &rapl_defaults_amd)
> >
> > Any comments/concerns are welcome.
>
> Say the first patch in the series is applied and the last one is not.
> Will anything break?

No. Without the last patch:
1. platforms using TPMI RAPL set the .enable_pmu flag, and the PMU is
   registered via the RAPL framework
2. platforms using MSR RAPL don't set the .enable_pmu flag, and the
   PMU is registered by arch/x86/events/rapl.c
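
To illustrate the split described above, here is a minimal sketch. It is
not code from the series: the struct, the enums and both helper functions
are hypothetical names made up for illustration. Only the .enable_pmu
flag and the two registration paths come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a RAPL interface driver; not the kernel's struct. */
enum rapl_if_kind { IF_MSR, IF_TPMI };

enum pmu_owner {
	PMU_VIA_RAPL_FRAMEWORK,	/* PMU registered through the RAPL framework */
	PMU_VIA_PERF_RAPL,	/* PMU registered by arch/x86/events/rapl.c */
};

struct rapl_if_model {
	enum rapl_if_kind kind;
	bool enable_pmu;	/* mirrors the .enable_pmu flag */
};

/* Assumed state with the last patch of the series NOT applied. */
static struct rapl_if_model model_before_last_patch(enum rapl_if_kind kind)
{
	struct rapl_if_model m = {
		.kind = kind,
		.enable_pmu = (kind == IF_TPMI),	/* only TPMI opts in */
	};
	return m;
}

/* Exactly one side registers the PMU for a given platform. */
static enum pmu_owner pick_pmu_owner(const struct rapl_if_model *m)
{
	assert(m != NULL);
	return m->enable_pmu ? PMU_VIA_RAPL_FRAMEWORK : PMU_VIA_PERF_RAPL;
}
```

Since the MSR and TPMI drivers don't function on the same platform, only
one of the two paths is taken on any given system in this model.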

>
> Regardless of the above, if any existing code is moved unmodified by
> this series to a new location,

intel_rapl PMU support shares a lot of code with
arch/x86/events/rapl.c, but there are still quite a few differences,
including:
1. dynamic PMU probing
2. using intel_rapl wrappers to get energy units and read the energy
   counter, etc.
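
For context on the energy unit handling, here is a rough sketch of the
arithmetic involved. It is not taken from the patches; the function names
are hypothetical. It assumes the usual MSR layout, where the Energy Status
Units (ESU) field sits in bits 12:8 of the RAPL power unit MSR and one raw
count equals 1/2^ESU Joules, that the perf events are scaled to 2^-32
Joules, and that powercap sysfs reports microjoules. Domains with a fixed
energy unit are not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the Energy Status Units field (bits 12:8) from the MSR value. */
static unsigned int esu_from_power_unit(uint64_t power_unit_msr)
{
	return (unsigned int)((power_unit_msr >> 8) & 0x1f);
}

/*
 * Convert a raw 32-bit energy counter value (units of 1/2^esu J)
 * into units of 2^-32 J, the scale assumed for the perf events.
 */
static uint64_t raw_to_perf_units(uint32_t raw, unsigned int esu)
{
	assert(esu <= 31);	/* the field is only 5 bits wide */
	return (uint64_t)raw << (32 - esu);
}

/*
 * Convert the same raw value into microjoules, the unit assumed for
 * powercap sysfs. raw * 10^6 cannot overflow 64 bits for 32-bit raw.
 */
static uint64_t raw_to_microjoules(uint32_t raw, unsigned int esu)
{
	assert(esu <= 31);
	return ((uint64_t)raw * 1000000ULL) >> esu;
}
```

With ESU = 14, for example, a raw delta of 16384 counts corresponds to
exactly 1 Joule on both interfaces.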

thanks,
rui
> it would be nice to be able to see that
> in the patches.  Otherwise, some subtle differences may be missed.