Re: [PATCH v2 2/2] powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs

From: Gautham R. Shenoy
Date: Wed Aug 21 2024 - 00:57:33 EST


Hello Rafael,

On Mon, Aug 19, 2024 at 03:34:09PM +0200, Rafael J. Wysocki wrote:
> On Thu, Aug 8, 2024 at 1:18 PM Dhananjay Ugwekar
> <Dhananjay.Ugwekar@xxxxxxx> wrote:
> >
> > Hello Rafael,
> >
> > On 8/2/2024 6:05 PM, Rafael J. Wysocki wrote:
> > > On Tue, Jul 30, 2024 at 6:53 AM Dhananjay Ugwekar
> > > <Dhananjay.Ugwekar@xxxxxxx> wrote:
> > >>
> > >> After commit ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf"),
> > >> on AMD processors that support extended CPUID leaf 0x80000026, the
> > >> topology_logical_die_id() macros, no longer returns package id, instead it
> > >> returns the CCD (Core Complex Die) id. This leads to the energy-pkg
> > >> event scope to be modified to CCD instead of package.
> > >>
> > >> For more historical context, please refer to commit 32fb480e0a2c
> > >> ("powercap/intel_rapl: Support multi-die/package"), which initially changed
> > >> the RAPL scope from package to die for all systems, as Intel systems
> > >> with Die enumeration have RAPL scope as die, and those without die
> > >> enumeration are not affected. So, all systems(Intel, AMD, Hygon), worked
> > >> correctly with topology_logical_die_id() until recently, but this changed
> > >> after the "0x80000026 leaf" commit mentioned above.
> > >>
> > >> Future multi-die Intel systems will have package scope RAPL counters,
> > >> but they will be using TPMI RAPL interface, which is not affected by
> > >> this change.
> > >>
> > >> Replacing topology_logical_die_id() with topology_physical_package_id()
> > >> conditionally only for AMD and Hygon fixes the energy-pkg event.
> > >>
> > >> On an AMD 2 socket 8 CCD Zen4 server:
> > >>
> > >> Before:
> > >>
> > >> linux$ ls /sys/class/powercap/
> > >> intel-rapl intel-rapl:4 intel-rapl:8:0 intel-rapl:d
> > >> intel-rapl:0 intel-rapl:4:0 intel-rapl:9 intel-rapl:d:0
> > >> intel-rapl:0:0 intel-rapl:5 intel-rapl:9:0 intel-rapl:e
> > >> intel-rapl:1 intel-rapl:5:0 intel-rapl:a intel-rapl:e:0
> > >> intel-rapl:1:0 intel-rapl:6 intel-rapl:a:0 intel-rapl:f
> > >> intel-rapl:2 intel-rapl:6:0 intel-rapl:b intel-rapl:f:0
> > >> intel-rapl:2:0 intel-rapl:7 intel-rapl:b:0
> > >> intel-rapl:3 intel-rapl:7:0 intel-rapl:c
> > >> intel-rapl:3:0 intel-rapl:8 intel-rapl:c:0
> > >>
> > >> After:
> > >>
> > >> linux$ ls /sys/class/powercap/
> > >> intel-rapl intel-rapl:0 intel-rapl:0:0 intel-rapl:1 intel-rapl:1:0
> > >>
> > >> Only one sysfs entry per-event per-package is created after this change.
> > >>
> > >> Fixes: 63edbaa48a57 ("x86/cpu/topology: Add support for the AMD 0x80000026 leaf")
> > >> Reported-by: Michael Larabel <michael@xxxxxxxxxxxxxxxxxx>
> > >> Signed-off-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@xxxxxxx>
> > >> Reviewed-by: Zhang Rui <rui.zhang@xxxxxxxxx>
> > >> ---
> > >> Changes in v2:
> > >> * Updated scope description comment, commit log
> > >> * Rename rapl_pmu_is_pkg_scope() to rapl_msrs_are_pkg_scope()
> > >> * Check topology_logical_(die/package)_id return value
> > >
> > > This patch does not depend on the first one in the series if I'm not
> > > mistaken, in which case I can pick it up separately if you want me to
> > > do that, so please let me know.
> >
> > Sorry for the late reply, was out sick,
> >
> > Yes, please pick this patch separately, it is independent from the first one.
>
> OK, applied as 6.12 material.

Can this go into 6.11 fixes?

It fixes the commit 63edbaa48a57 ("x86/cpu/topology: Add support for
the AMD 0x80000026 leaf") which got merged in 6.10.

--
Thanks and Regards
gautham.