Re: [PATCH v2 3/4] powercap: Add AMD Fam17h RAPL support
From: Zhang Rui
Date: Thu Oct 08 2020 - 23:47:10 EST
On Wed, 2020-10-07 at 11:14 -0500, Kim Phillips wrote:
> From: Victor Ding <victording@xxxxxxxxxx>
>
> This patch enables AMD Fam17h RAPL support for the power capping
> framework. The support is as per AMD Fam17h Model31h (Zen2) and
> model 00-ffh (Zen1) PPR.
>
> Tested by comparing the results of following two sysfs entries and the
> values directly read from corresponding MSRs via /dev/cpu/[x]/msr:
> /sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj
> /sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:0/energy_uj
>
> Signed-off-by: Victor Ding <victording@xxxxxxxxxx>
> Acked-by: Kim Phillips <kim.phillips@xxxxxxx>
> Cc: Victor Ding <victording@xxxxxxxxxx>
> Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> Cc: Pawan Gupta <pawan.kumar.gupta@xxxxxxxxxxxxxxx>
> Cc: "Peter Zijlstra (Intel)" <peterz@xxxxxxxxxxxxx>
> Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> Cc: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Tony Luck <tony.luck@xxxxxxxxx>
> Cc: Vineela Tummalapalli <vineela.tummalapalli@xxxxxxxxx>
> Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>
> Cc: linux-pm@xxxxxxxxxxxxxxx
> Cc: x86@xxxxxxxxxx
> ---
> Kim's changes from Victor's original submission:
>
> https://lore.kernel.org/lkml/20200729205144.3.I01b89fb23d7498521c84cfdf417450cbbfca46bb@changeid/
>
> - Added my Acked-by.
> - Added Daniel Lezcano to Cc.
>
> arch/x86/include/asm/msr-index.h | 1 +
> drivers/powercap/intel_rapl_common.c | 2 ++
> drivers/powercap/intel_rapl_msr.c | 27 ++++++++++++++++++++++++++-
> 3 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index f1b24f1b774d..c0646f69d2a5 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -324,6 +324,7 @@
> #define MSR_PP1_POLICY 0x00000642
>
> #define MSR_AMD_RAPL_POWER_UNIT 0xc0010299
> +#define MSR_AMD_CORE_ENERGY_STATUS 0xc001029a
> #define MSR_AMD_PKG_ENERGY_STATUS 0xc001029b
>
> /* Config TDP MSRs */
> diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
> index 983d75bd5bd1..6905ccffcec3 100644
> --- a/drivers/powercap/intel_rapl_common.c
> +++ b/drivers/powercap/intel_rapl_common.c
> @@ -1054,6 +1054,8 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
> X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &rapl_defaults_hsw_server),
> X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &rapl_defaults_hsw_server),
> +
> + X86_MATCH_VENDOR_FAM(AMD, 0x17, &rapl_defaults_core),
I doubt we can use rapl_defaults_core here.
static const struct rapl_defaults rapl_defaults_core = {
.floor_freq_reg_addr = 0,
.check_unit = rapl_check_unit_core,
.set_floor_freq = set_floor_freq_default,
.compute_time_window = rapl_compute_time_window_core,
};
.floor_freq_reg_addr = 0,
is redundant here, because omitted members of a static initializer are
already zero. It can be removed even from rapl_defaults_core.
.check_unit = rapl_check_unit_core,
The Intel UNIT MSR supports three units: Energy, Power and Time.
From the change below, only the energy counter is supported, so you may
need to confirm whether all three units are supported.
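For reference, a minimal user-space sketch of how the three fields are laid out in the Intel unit MSR (power in bits 3:0, energy in bits 12:8, time in bits 19:16). The struct and function names are made up for illustration and are not the driver's. Whether the AMD unit MSR populates all three fields is an assumption that has to be checked against the PPR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three exponents; not a kernel type. */
struct rapl_unit_exps {
	unsigned int power;	/* power unit  = 1 / 2^power  W */
	unsigned int energy;	/* energy unit = 1 / 2^energy J */
	unsigned int time;	/* time unit   = 1 / 2^time   s */
};

/* Split a raw unit MSR value into its fields (Intel layout assumed). */
void rapl_decode_units(uint64_t raw, struct rapl_unit_exps *out)
{
	assert(out != NULL);
	out->power  = (unsigned int)(raw & 0xf);
	out->energy = (unsigned int)((raw >> 8) & 0x1f);
	out->time   = (unsigned int)((raw >> 16) & 0xf);
}

/* Energy unit in nanojoules for a given exponent (integer division). */
uint64_t rapl_energy_unit_nj(unsigned int energy_exp)
{
	return UINT64_C(1000000000) >> energy_exp;
}
```

If the power and time fields read back as zero on AMD, a unit check that treats them as valid would need adjusting.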
.set_floor_freq = set_floor_freq_default,
This function sets the PL1_CLAMP bit in RAPL_DOMAIN_REG_LIMIT, but
RAPL_DOMAIN_REG_LIMIT is not supported on AMD CPUs.
.compute_time_window = rapl_compute_time_window_core,
This is used for setting the power limits, which are not supported on
AMD CPUs.
IMO, it's better to use a new rapl_defaults that contains only the
callbacks that are valid on AMD CPUs.
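Something along these lines is what I have in mind. This is only a stand-alone sketch: the struct is a simplified stand-in for the driver's struct rapl_defaults, every name ending in _sketch is hypothetical, and the callback signatures are not the real ones. It assumes the callers are taught to tolerate missing callbacks.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the driver's struct rapl_defaults. */
struct rapl_defaults_sketch {
	int (*check_unit)(unsigned long long raw_unit);
	void (*set_floor_freq)(int domain_id, int enable);
	unsigned long long (*compute_time_window)(unsigned long long value,
						  int to_raw);
};

/* Placeholder unit check that accepts any value. */
int check_unit_sketch(unsigned long long raw_unit)
{
	(void)raw_unit;
	return 0;
}

/* Only the unit check is populated; the omitted members are NULL. */
const struct rapl_defaults_sketch rapl_defaults_amd_sketch = {
	.check_unit = check_unit_sketch,
};

/* Callers guard optional callbacks. Returns 1 if the callback ran. */
int try_set_floor_freq(const struct rapl_defaults_sketch *d,
		       int domain_id, int enable)
{
	assert(d != NULL);
	if (!d->set_floor_freq)
		return 0;
	d->set_floor_freq(domain_id, enable);
	return 1;
}
```

With a table like this, the limit-related paths are skipped on AMD instead of touching registers that do not exist.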
> {}
> };
> MODULE_DEVICE_TABLE(x86cpu, rapl_ids);
> diff --git a/drivers/powercap/intel_rapl_msr.c b/drivers/powercap/intel_rapl_msr.c
> index c68ef5e4e1c4..dcaef917f79d 100644
> --- a/drivers/powercap/intel_rapl_msr.c
> +++ b/drivers/powercap/intel_rapl_msr.c
> @@ -48,6 +48,21 @@ static struct rapl_if_priv rapl_msr_priv_intel = {
> .limits[RAPL_DOMAIN_PACKAGE] = 2,
> };
>
> +static struct rapl_if_priv rapl_msr_priv_amd = {
> + .reg_unit = MSR_AMD_RAPL_POWER_UNIT,
> + .regs[RAPL_DOMAIN_PACKAGE] = {
> + 0, MSR_AMD_PKG_ENERGY_STATUS, 0, 0, 0 },
> + .regs[RAPL_DOMAIN_PP0] = {
> + 0, MSR_AMD_CORE_ENERGY_STATUS, 0, 0, 0 },
> + .regs[RAPL_DOMAIN_PP1] = {
> + 0, 0, 0, 0, 0 },
> + .regs[RAPL_DOMAIN_DRAM] = {
> + 0, 0, 0, 0, 0 },
> + .regs[RAPL_DOMAIN_PLATFORM] = {
> + 0, 0, 0, 0, 0},
I don't think you need to set the PP1/DRAM/PLATFORM registers to 0 explicitly if they are not supported.
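To illustrate, here is a small stand-alone sketch showing that array elements left out of a designated initializer are zero-initialized. The enum and struct names are stand-ins invented for this example, not the driver's definitions. The MSR numbers are the ones from this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the driver's domain and register-id enums. */
enum { DOM_PACKAGE, DOM_PP0, DOM_PP1, DOM_DRAM, DOM_PLATFORM, DOM_MAX };
enum { REG_LIMIT, REG_STATUS, REG_PERF, REG_POLICY, REG_INFO, REG_MAX };

/* Simplified stand-in for struct rapl_if_priv. */
struct if_priv_sketch {
	unsigned int reg_unit;
	unsigned int regs[DOM_MAX][REG_MAX];
	int limits[DOM_MAX];
};

/* PP1, DRAM and PLATFORM are not listed, so they are all zeros. */
struct if_priv_sketch priv_amd_sketch = {
	.reg_unit = 0xc0010299,
	.regs[DOM_PACKAGE] = { 0, 0xc001029b, 0, 0, 0 },
	.regs[DOM_PP0] = { 0, 0xc001029a, 0, 0, 0 },
};

/* Returns 1 if any register is set for the domain, 0 otherwise. */
int domain_has_any_reg(const struct if_priv_sketch *p, int dom)
{
	int i;

	assert(p != NULL && dom >= 0 && dom < DOM_MAX);
	for (i = 0; i < REG_MAX; i++)
		if (p->regs[dom][i])
			return 1;
	return 0;
}
```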
> + .limits[RAPL_DOMAIN_PACKAGE] = 1,
Is Pkg Domain PL1 really supported?
At least according to this patch, I don't think so. If the power limit
is supported, it is better to add that support in this patch as well.
If not, we're actually exposing energy counters only. In that case,
I'm not sure it is proper to do this in the powercap class, because
we cannot actually do any power capping. Or at least, if we want to
support power zones with no power limits, we should enhance the code to
support this rather than fake a power limit.
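As a rough illustration of the last option, the zone setup could derive the number of constraints from the real limit count and accept zero. The sketch below is stand-alone and hypothetical: struct zone_sketch and both helpers are invented for this mail and do not exist in the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified power zone; not the driver's struct. */
struct zone_sketch {
	const char *name;
	unsigned int energy_reg;	/* 0 means no energy counter */
	int nr_limits;			/* real power limits, 0 if none */
};

/* Constraint entries to register: none for an energy-only zone. */
int zone_nr_constraints(const struct zone_sketch *z)
{
	assert(z != NULL);
	return z->nr_limits > 0 ? z->nr_limits : 0;
}

/* A zone is worth exposing if it reports energy, even with no limits. */
int zone_is_usable(const struct zone_sketch *z)
{
	assert(z != NULL);
	return z->energy_reg != 0;
}
```

An AMD package zone would then be usable with zero constraints, and no fake PL1 is needed.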
thanks,
rui
> +};
> +
> /* Handles CPU hotplug on multi-socket systems.
> * If a CPU goes online as the first CPU of the physical package
> * we add the RAPL package to the system. Similarly, when the last
> @@ -137,7 +152,17 @@ static int rapl_msr_probe(struct platform_device *pdev)
> const struct x86_cpu_id *id = x86_match_cpu(pl4_support_ids);
> int ret;
>
> - rapl_msr_priv = &rapl_msr_priv_intel;
> + switch (boot_cpu_data.x86_vendor) {
> + case X86_VENDOR_INTEL:
> + rapl_msr_priv = &rapl_msr_priv_intel;
> + break;
> + case X86_VENDOR_AMD:
> + rapl_msr_priv = &rapl_msr_priv_amd;
> + break;
> + default:
> + pr_err("intel-rapl does not support CPU vendor %d\n", boot_cpu_data.x86_vendor);
> + return -ENODEV;
> + }
> rapl_msr_priv->read_raw = rapl_msr_read_raw;
> rapl_msr_priv->write_raw = rapl_msr_write_raw;
>