[PATCH v5 00/22] cpufreq: introduce a new AMD CPU frequency control mechanism

From: Huang Rui
Date: Tue Nov 30 2021 - 07:37:13 EST


Hi all,

We would like to introduce a new AMD CPU frequency control mechanism, the
"amd-pstate" driver, for modern AMD Zen based CPU series in the Linux
kernel. The new mechanism is based on Collaborative Processor Performance
Control (CPPC), which provides finer-grained frequency management than
legacy ACPI hardware P-States. Current AMD CPU platforms use the ACPI
P-states driver to manage CPU frequency and clocks, switching among only 3
P-states. AMD P-States is intended to replace the ACPI P-states controls
and provides a flexible, low-latency interface for the Linux kernel to
communicate performance hints directly to the hardware.

"amd-pstate" leverages the Linux kernel governors such as *schedutil*,
*ondemand*, etc. to manage the performance hints which are provided by CPPC
hardware functionality. The first version of amd-pstate supports one of the
Zen3 processors, and we will support more in the future after we verify the
hardware and SBIOS functionalities.

There are two types of hardware implementations for amd-pstate: one is full
MSR support and the other is shared memory support. The driver can use the
X86_FEATURE_CPPC feature flag to distinguish between the two types.
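
As a rough userspace illustration of this distinction (not code from this
series), the sketch below looks for a "cppc" entry in the flags line of
/proc/cpuinfo. It rests on two assumptions that this cover letter does not
state: that X86_FEATURE_CPPC shows up there as "cppc", and that the flag
being set means the full MSR implementation, while its absence means the
shared memory path would apply if the firmware exposes _CPC at all. The
helper names are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def cppc_interface(cpuinfo_text):
    """Guess which amd-pstate interface applies (illustrative only)."""
    # Assumption: "cppc" is how X86_FEATURE_CPPC appears in the flags line,
    # and its presence indicates the full MSR implementation.
    return "msr" if "cppc" in cpu_flags(cpuinfo_text) else "shared-memory"


# Made-up sample text; on a real system pass open("/proc/cpuinfo").read().
sample = "processor\t: 0\nflags\t\t: fpu vme cppc\n"
print(cppc_interface(sample))
```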

Using the new AMD P-States method together with the kernel governors
(*schedutil*, *ondemand*, ...) to manage frequency updates is the most
appropriate bridge between AMD Zen based processors and the Linux kernel:
the processor is able to adjust to the most efficient frequency according
to the kernel scheduler load.

Please check the detailed CPU feature and MSR register descriptions in the
Processor Programming Reference (PPR) for AMD Family 19h Model 51h,
Revision A1 Processors:

https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip

Performance Per Watt (PPW) Calculation:

We use the RAPL interface with the "perf" tool to get the package energy
data.
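
For reference, the PPW figures below follow from dividing benchmark
throughput by average package power, which is why the unit is MB / J. A
minimal sketch of that arithmetic, assuming the energy in joules and the
elapsed time come from a RAPL package energy counter (for example perf's
power/energy-pkg/ event). The function name and the sample numbers are made
up for illustration and are not measurements from this series.

```python
def performance_per_watt(throughput_mb_s, energy_joules, elapsed_s):
    """Return MB per joule: throughput (MB/s) divided by average power (W)."""
    if energy_joules <= 0 or elapsed_s <= 0:
        raise ValueError("energy and elapsed time must be positive")
    avg_power_w = energy_joules / elapsed_s   # W = J / s
    return throughput_mb_s / avg_power_w      # (MB/s) / (J/s) = MB / J


# Made-up sample: 3000 MB/s sustained for 60 s while the package used 3600 J.
sample_ppw = performance_per_watt(3000.0, 3600.0, 60.0)
print(f"{sample_ppw:.2f} MB/J")
```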

The comparison between the amd-pstate and acpi-cpufreq modules was run on
an AMD Cezanne processor (mobile CPU):

1) TBench CPU benchmark:

+----------------------------------------------------------------------------------------------+
|                                TBench4 (Performance Per Watt)                                |
|                                       Higher is better                                       |
+-------------------+------------------------+------------------------+------------------------+
|                   |  Performance Per Watt  |  Performance Per Watt  |  Performance Per Watt  |
|   Kernel Module   |      (Schedutil)       |       (Ondemand)       |     (Performance)      |
|                   |      Unit: MB / J      |      Unit: MB / J      |      Unit: MB / J      |
+-------------------+------------------------+------------------------+------------------------+
|   acpi-cpufreq    |         48.56          |         48.89          |         47.81          |
+-------------------+------------------------+------------------------+------------------------+
|    amd-pstate     |         48.38          |         47.38          |         48.77          |
+-------------------+------------------------+------------------------+------------------------+

Note: The previous data was based on TBench2. To align with SUSE, we
re-tested with TBench4. The PPW is very close to that of acpi-cpufreq. We
are still re-running the other tests.
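
To put "very close" in numbers, the relative PPW difference per governor
can be computed directly from the table above. This is only arithmetic on
the published values. The dictionary layout and function name are
illustrative.

```python
# PPW values (MB / J) copied from the TBench4 table above.
PPW = {
    "acpi-cpufreq": {"schedutil": 48.56, "ondemand": 48.89, "performance": 47.81},
    "amd-pstate": {"schedutil": 48.38, "ondemand": 47.38, "performance": 48.77},
}


def relative_diff_percent(new, base):
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


# amd-pstate relative to acpi-cpufreq, one entry per governor.
deltas = {
    gov: relative_diff_percent(PPW["amd-pstate"][gov], PPW["acpi-cpufreq"][gov])
    for gov in PPW["acpi-cpufreq"]
}
for gov, pct in deltas.items():
    print(f"{gov:12s} {pct:+.2f}%")
```

With these values amd-pstate lands within roughly -3.1% to +2.0% of
acpi-cpufreq, depending on the governor.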

2) Steam Game Demo on Ryzen 5900X (12 cores / 24 threads):

A picture comparing acpi-cpufreq and amd-pstate:
https://drive.google.com/file/d/1PvSduykJn9U5MMOhzFWycnbmGmznalM3/view?usp=sharing

Two videos:
https://drive.google.com/file/d/1nQQEteL-v_zQxnOJpyW8JqvRW2FFDN2Z/view?usp=sharing
https://drive.google.com/file/d/1heuPgFG71SQHvGb6wfedrQciBfE2rhnu/view?usp=sharing

Note that the amd-pstate driver doesn't change the physical maximum
frequency capacity of the processor. But it is able to provide a finer
grained performance control range instead of the legacy 3 P-States, which
gives better performance and power efficiency than before. We will continue
to optimize amd-pstate with the kernel governors to support different types
of processors such as mobile laptops, performance desktops, etc.
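
To illustrate what a finer-grained range buys, the sketch below compares
how closely a requested frequency can be matched when only three fixed
P-states are available versus a continuous abstract performance scale. All
numbers here (the three P-state frequencies, the 1..255 perf scale, the
linear perf-to-frequency mapping) are assumptions for illustration, not
values from this series or from any specific processor.

```python
import math

# Hypothetical legacy table: three fixed P-state frequencies in MHz.
LEGACY_PSTATES_MHZ = [1400, 1700, 3300]

# Hypothetical CPPC-style scale: perf levels 1..255 mapped linearly to MHz.
HIGHEST_PERF = 255
MAX_FREQ_MHZ = 3300


def legacy_select(target_mhz):
    """Pick the lowest P-state that satisfies the request (or the highest)."""
    for freq in LEGACY_PSTATES_MHZ:
        if freq >= target_mhz:
            return freq
    return LEGACY_PSTATES_MHZ[-1]


def cppc_select(target_mhz):
    """Pick the lowest perf level whose frequency satisfies the request."""
    perf = math.ceil(target_mhz * HIGHEST_PERF / MAX_FREQ_MHZ)
    perf = max(1, min(HIGHEST_PERF, perf))  # clamp to the valid perf range
    return perf * MAX_FREQ_MHZ / HIGHEST_PERF


target = 2000
legacy_overshoot = legacy_select(target) - target
cppc_overshoot = cppc_select(target) - target
print(f"legacy: +{legacy_overshoot} MHz, cppc-style: +{cppc_overshoot:.1f} MHz")
```

Under these assumptions a 2000 MHz request has to jump to the top P-state
in the legacy table, while the finer scale stays within one perf step of
the request.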

See the patch series in the git repo below:
V1: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v1
V2: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v2
V3: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v3
V4: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v4
V5: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=amd-pstate-dev-v5

For a detailed introduction, please see patch 22.

Changes from V1 -> V2:
- cpufreq:
- - Add detailed description in the commit log.
- - Clean up the "extension" postfix in the x86 feature flag.
- - Revise cppc_set_enable helper.
- - Add a fix to check online cpus in cppc_acpi.
- - Use static calls to avoid retpolines.
- - Revise the comment style.
- - Remove amd_pstate_boost_supported() function.
- - Revise the return value in sysfs attribute functions.
- cpupower:
- - Refine the commit log for cpupower patches.
- - Expose a function to get the sysfs value from specific table.
- - Move amd-pstate sysfs definitions and functions into amd helper file.
- - Move the boost init function into amd helper file and explain the
details in the commit log.
- - Remove the amd_pstate_get_data in the lib/cpufreq.c to keep the lib
limited to common operations.
- - Move print_speed function into misc helper file.
- - Add amd_pstate_show_perf_and_freq() function in amd helper for
cpufreq-info print.

Changes from V2 -> V3:
- cpufreq:
- - Add a patch from Steven to add SystemIO register support in the cppc
lib. (Thanks for verifying the driver on his platform.)
- - Update online cpu mask to present cpu.
- - Enhance cppc_set_enable to cover all valid use cases.
- - Add more description in the Kconfig definition.
- - Clean up some redundant functions and data members.
- - Revise amd-pstate trace event prints.
- - Move the amd-pstate traces into the power trace system and set the
driver as built-in instead of a module.
- - Clean up the duplicated sysfs with core cpufreq driver.
- - Revise the amd-pstate RST documentation.
- cpupower:
- - Revise the cpupower_amd_pstate_enabled() function to use the
cpufreq_get_driver helper instead of reading sysfs.
- - Clean up the amd-pstate max/min frequency APIs, because they are
actually the same as the cpufreq info sysfs.

Changes from V3 -> V4:
- cpufreq:
- - Rebase the whole series to Rafael's pm branch (5.15)
- - Fix the typo in the commit message and comment.
- - Clean up function implementation.
- - Clean up freq&perf sysfs APIs.
- - Go back to keeping the amd-pstate traces out of the power trace system,
because that makes it more flexible to debug and fine-tune processors with
the shared memory solution.
- - Add a kernel param to disable amd-pstate on shared memory processors;
it can be enabled manually.
- cpupower:
- - Introduce acpi cppc library support.
- - Clean up the duplicated amd specific perf/frequency.

We received two issue reports from the community on processors with the
shared memory solution.

First one:
https://lore.kernel.org/linux-pm/a0e932477e9b826c0781dda1d0d2953e57f904cc.camel@xxxxxxx/

Thanks to Giovanni's support and suggestions, we corrected the calculation
method and reproduced the performance issue on a 64 core / 128 thread
Threadripper, which is similar to the EPYC 7713. According to the firmware
team, the hardware responds to frequency changes at a sample rate of about
1 ms. We are working to simulate the ACPI P-states with the CPPC API and
checking where the gap with acpi-cpufreq is. We will share the status
later.

Second one:
https://lore.kernel.org/linux-pm/f9323c6fddd4a55d8ca4191a9539ebd056221045.camel@xxxxxxxxx/

Thanks to Matt for helping us reproduce this issue on our side with a Ryzen
5900X. Below is a video comparing amd-pstate and acpi-cpufreq on the
"Control" Steam game. The FPS can be increased as well. It is probably
something wrong in the SBIOS.
https://lore.kernel.org/linux-pm/DM4PR12MB5136084E5DC1809FBB578075F1959@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Due to these issues with the shared memory solution, we fall back to
building the driver as a module and add a "shared_mem" parameter to disable
amd-pstate on processors with the shared memory solution for the moment,
until we complete tuning on this function.
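
One way to confirm from userspace which cpufreq driver ended up managing a
CPU is to read the scaling_driver sysfs attribute. A minimal sketch,
assuming the driver registers under the name "amd-pstate" (suggested by the
module name, but not stated in this cover letter). The helper names and the
overridable base path are hypothetical and only there so that the sketch
can be exercised against a fake tree.

```python
import os
import tempfile

CPUFREQ_SYSFS = "/sys/devices/system/cpu"


def scaling_driver(cpu=0, base=CPUFREQ_SYSFS):
    """Return the cpufreq driver name for `cpu`, or None if unavailable."""
    path = os.path.join(base, f"cpu{cpu}", "cpufreq", "scaling_driver")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def amd_pstate_enabled(cpu=0, base=CPUFREQ_SYSFS):
    """True if the active cpufreq driver name is "amd-pstate" (assumed name)."""
    return scaling_driver(cpu, base) == "amd-pstate"


# Self-check against a fake sysfs tree, so the sketch runs on any machine.
with tempfile.TemporaryDirectory() as fake:
    os.makedirs(os.path.join(fake, "cpu0", "cpufreq"))
    with open(os.path.join(fake, "cpu0", "cpufreq", "scaling_driver"), "w") as f:
        f.write("amd-pstate\n")
    fake_result = amd_pstate_enabled(0, base=fake)
print(fake_result)
```

On a real system, calling amd_pstate_enabled() with the default base path
checks cpu0.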

Changes from V4 -> V5:
- cpufreq:
- - Fix subject typo.
- - Fix Clang-13 build warning.
- - Update RST documentation for the latest update.
- cpupower:
- - Fix the table check condition at cpufreq_get_sysfs_value_from_table.

Thanks,
Ray


Huang Rui (19):
  x86/cpufeatures: add AMD Collaborative Processor Performance Control
    feature flag
  x86/msr: add AMD CPPC MSR definitions
  cpufreq: amd: introduce a new amd pstate driver to support future
    processors
  cpufreq: amd: add fast switch function for amd-pstate
  cpufreq: amd: introduce the support for the processors with shared
    memory solution
  cpufreq: amd: add trace for amd-pstate module
  cpufreq: amd: add boost mode support for amd-pstate
  cpufreq: amd: add amd-pstate frequencies attributes
  cpufreq: amd: add amd-pstate performance attributes
  cpupower: add AMD P-state capability flag
  cpupower: add the function to check amd-pstate enabled
  cpupower: initial AMD P-state capability
  cpupower: add the function to get the sysfs value from specific table
  cpupower: introduce acpi cppc library
  cpupower: add amd-pstate sysfs definition and access helper
  cpupower: enable boost state support for amd-pstate module
  cpupower: move print_speed function into misc helper
  cpupower: print amd-pstate information on cpupower
  Documentation: amd-pstate: add amd-pstate driver introduction

Jinzhou Su (1):
  ACPI: CPPC: add cppc enable register function

Mario Limonciello (1):
  ACPI: CPPC: Check present CPUs for determining _CPC is valid

Steven Noonan (1):
  ACPI: CPPC: implement support for SystemIO registers

 Documentation/admin-guide/acpi/cppc_sysfs.rst |   2 +
 Documentation/admin-guide/pm/amd-pstate.rst   | 383 +++++++++++
 .../admin-guide/pm/working-state.rst          |   1 +
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |  17 +
 drivers/acpi/cppc_acpi.c                      |  93 ++-
 drivers/cpufreq/Kconfig.x86                   |  17 +
 drivers/cpufreq/Makefile                      |   5 +
 drivers/cpufreq/amd-pstate-trace.c            |   2 +
 drivers/cpufreq/amd-pstate-trace.h            |  77 +++
 drivers/cpufreq/amd-pstate.c                  | 609 ++++++++++++++++++
 include/acpi/cppc_acpi.h                      |   5 +
 tools/power/cpupower/Makefile                 |   6 +-
 tools/power/cpupower/lib/acpi_cppc.c          |  59 ++
 tools/power/cpupower/lib/acpi_cppc.h          |  21 +
 tools/power/cpupower/lib/cpufreq.c            |  21 +-
 tools/power/cpupower/lib/cpufreq.h            |  12 +
 tools/power/cpupower/utils/cpufreq-info.c     |  68 +-
 tools/power/cpupower/utils/helpers/amd.c      |  76 +++
 tools/power/cpupower/utils/helpers/cpuid.c    |  13 +
 tools/power/cpupower/utils/helpers/helpers.h  |  22 +
 tools/power/cpupower/utils/helpers/misc.c     |  62 ++
 22 files changed, 1508 insertions(+), 64 deletions(-)
 create mode 100644 Documentation/admin-guide/pm/amd-pstate.rst
 create mode 100644 drivers/cpufreq/amd-pstate-trace.c
 create mode 100644 drivers/cpufreq/amd-pstate-trace.h
 create mode 100644 drivers/cpufreq/amd-pstate.c
 create mode 100644 tools/power/cpupower/lib/acpi_cppc.c
 create mode 100644 tools/power/cpupower/lib/acpi_cppc.h

--
2.25.1