RE: [PATCH V8 0/7] amd-pstate preferred core

From: Meng, Li (Jassmine)
Date: Mon Oct 09 2023 - 22:15:17 EST


[AMD Official Use Only - General]

Hi Oleksandr:

> -----Original Message-----
> From: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
> Sent: Monday, October 9, 2023 9:00 PM
> To: Rafael J . Wysocki <rafael.j.wysocki@xxxxxxxxx>; Huang, Ray
> <Ray.Huang@xxxxxxx>; Meng, Li (Jassmine) <Li.Meng@xxxxxxx>
> Cc: linux-pm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> x86@xxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; Shuah Khan
> <skhan@xxxxxxxxxxxxxxxxxxx>; linux-kselftest@xxxxxxxxxxxxxxx; Fontenot,
> Nathan <Nathan.Fontenot@xxxxxxx>; Sharma, Deepak
> <Deepak.Sharma@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>; Limonciello, Mario
> <Mario.Limonciello@xxxxxxx>; Huang, Shimmer
> <Shimmer.Huang@xxxxxxx>; Yuan, Perry <Perry.Yuan@xxxxxxx>; Du,
> Xiaojian <Xiaojian.Du@xxxxxxx>; Viresh Kumar <viresh.kumar@xxxxxxxxxx>;
> Borislav Petkov <bp@xxxxxxxxx>
> Subject: Re: [PATCH V8 0/7] amd-pstate preferred core
>
> Hello.
>
> On Monday, 9 October 2023 9:23:29 CEST Meng, Li (Jassmine) wrote:
> > Hi Oleksandr:
> >
> > > -----Original Message-----
> > > From: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
> > > Sent: Monday, October 9, 2023 2:55 PM
> > > To: Rafael J . Wysocki <rafael.j.wysocki@xxxxxxxxx>; Huang, Ray
> > > <Ray.Huang@xxxxxxx>; Meng, Li (Jassmine) <Li.Meng@xxxxxxx>
> > > Cc: linux-pm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > x86@xxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; Shuah Khan
> > > <skhan@xxxxxxxxxxxxxxxxxxx>; linux-kselftest@xxxxxxxxxxxxxxx;
> > > Fontenot, Nathan <Nathan.Fontenot@xxxxxxx>; Sharma, Deepak
> > > <Deepak.Sharma@xxxxxxx>; Deucher, Alexander
> > > <Alexander.Deucher@xxxxxxx>; Limonciello, Mario
> > > <Mario.Limonciello@xxxxxxx>; Huang, Shimmer
> > > <Shimmer.Huang@xxxxxxx>;
> > > Yuan, Perry <Perry.Yuan@xxxxxxx>; Du, Xiaojian
> > > <Xiaojian.Du@xxxxxxx>; Viresh Kumar <viresh.kumar@xxxxxxxxxx>;
> > > Borislav Petkov <bp@xxxxxxxxx>; Meng, Li (Jassmine)
> > > <Li.Meng@xxxxxxx>
> > > Subject: Re: [PATCH V8 0/7] amd-pstate preferred core
> > >
> > > Hello.
> > >
> > > On Monday, 9 October 2023 4:49:25 CEST Meng Li wrote:
> > > > Hi all:
> > > >
> > > > > The core frequency is subject to process variation in semiconductors.
> > > > > Not all cores are able to reach the maximum frequency while respecting
> > > > > the infrastructure limits. Consequently, AMD has redefined the
> > > > > concept of maximum frequency of a part. This means that only a
> > > > > fraction of cores can reach the maximum frequency. To find the best
> > > > > process scheduling policy for a given scenario, the OS needs to know
> > > > > the core ordering informed by the platform through the highest
> > > > > performance capability register of the CPPC interface.
> > > >
> > > > > Earlier implementations of amd-pstate preferred core only support
> > > > > a static core ranking and targeted performance. Now it has the
> > > > > ability to dynamically change the preferred core based on the
> > > > > workload and platform conditions, accounting for thermals and aging.
> > > >
> > > > > The amd-pstate driver utilizes the functions and data structures
> > > > > provided by the ITMT architecture to enable the scheduler to favor
> > > > > scheduling on cores which can get a higher frequency with lower
> > > > > voltage. We call it amd-pstate preferred core.
> > > >
> > > > > Here sched_set_itmt_core_prio() is called to set priorities and
> > > > > sched_set_itmt_support() is called to enable the ITMT feature.
> > > > > The amd-pstate driver uses the highest performance value to indicate
> > > > > the priority of a CPU. A higher value means a higher priority.
> > > >
> > > > > The amd-pstate driver provides an initial core ordering at boot time.
> > > > > It relies on the CPPC interface to communicate the core ranking to
> > > > > the operating system and scheduler, so that the OS chooses the cores
> > > > > with the highest performance first when scheduling processes.
> > > > > When the amd-pstate driver receives a message that the highest
> > > > > performance has changed, it updates the core ranking.
> > > >
> > > > > Changes from V7->V8:
> > > > > - all:
> > > > > - - pick up Reviewed-by tags added by Mario and Ray.
> > > > > - cpufreq: amd-pstate:
> > > > > - - embed hw_prefcore into the cpudata structure.
> > > > > - - delete preferred core init from cpu online/offline.
> > >
> > > Could you please let me know if this change means a fix for the
> > > report I've sent previously? [1]
> > >
> > [Meng, Li (Jassmine)] Yes.
> > I have deleted the online handler function of the amd-pstate driver,
> > so it no longer re-initializes the preferred core.
> > That online function was setting an incorrect des perf value.
>
> Thank you for the confirmation. I've built v6.5.5 with this patchset applied,
> and now the frequency is as expected after the suspend-resume cycle.
>
> I've also added the following modification to accommodate recent feedback:
>
> ```
> commit 1450ac395434c532f995521e1a2497d09ddf106c
> Author: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
> Date: Mon Oct 9 11:19:50 2023 +0200
>
> cpufreq/amd-pstate: show prefcore_ranking separately
>
> Signed-off-by: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index d3369247c6c9c..86999d861e87b 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -954,6 +954,17 @@ static ssize_t show_amd_pstate_highest_perf(struct cpufreq_policy *policy,
> u32 perf;
> struct amd_cpudata *cpudata = policy->driver_data;
>
> + perf = READ_ONCE(cpudata->highest_perf);
> +
> + return sysfs_emit(buf, "%u\n", perf);
> +}
> +
> +static ssize_t show_amd_pstate_prefcore_ranking(struct cpufreq_policy *policy,
> + char *buf)
> +{
> + u32 perf;
> + struct amd_cpudata *cpudata = policy->driver_data;
> +
> perf = READ_ONCE(cpudata->prefcore_ranking);
>
> return sysfs_emit(buf, "%u\n", perf);
> @@ -1172,6 +1183,7 @@ cpufreq_freq_attr_ro(amd_pstate_max_freq);
> cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
>
> cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> +cpufreq_freq_attr_ro(amd_pstate_prefcore_ranking);
> cpufreq_freq_attr_ro(amd_pstate_hw_prefcore);
> cpufreq_freq_attr_rw(energy_performance_preference);
> cpufreq_freq_attr_ro(energy_performance_available_preferences);
> @@ -1182,6 +1194,7 @@ static struct freq_attr *amd_pstate_attr[] = {
> &amd_pstate_max_freq,
> &amd_pstate_lowest_nonlinear_freq,
> &amd_pstate_highest_perf,
> + &amd_pstate_prefcore_ranking,
> &amd_pstate_hw_prefcore,
> NULL,
> };
> @@ -1190,6 +1203,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
> &amd_pstate_max_freq,
> &amd_pstate_lowest_nonlinear_freq,
> &amd_pstate_highest_perf,
> + &amd_pstate_prefcore_ranking,
> &amd_pstate_hw_prefcore,
> &energy_performance_preference,
> &energy_performance_available_preferences,
> ```
>
> with the following output as a result:
>
> ```
> [~]> grep . /sys/devices/system/cpu*/cpufreq/policy*/amd_pstate_highest_perf
> /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy1/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy2/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy3/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy4/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy5/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy6/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy7/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy8/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy9/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy10/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy11/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy12/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy13/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy14/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy15/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy16/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy17/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy18/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy19/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy20/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy21/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy22/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy23/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy24/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy25/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy26/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy27/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy28/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy29/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy30/amd_pstate_highest_perf:166
> /sys/devices/system/cpu/cpufreq/policy31/amd_pstate_highest_perf:166
>
> [~]> grep . /sys/devices/system/cpu*/cpufreq/policy*/amd_pstate_hw_prefcore
> /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy1/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy2/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy3/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy4/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy5/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy6/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy7/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy8/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy9/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy10/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy11/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy12/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy13/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy14/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy15/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy16/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy17/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy18/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy19/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy20/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy21/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy22/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy23/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy24/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy25/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy26/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy27/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy28/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy29/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy30/amd_pstate_hw_prefcore:supported
> /sys/devices/system/cpu/cpufreq/policy31/amd_pstate_hw_prefcore:supported
>
> [~]> grep . /sys/devices/system/cpu*/cpufreq/policy*/amd_pstate_prefcore_ranking
> /sys/devices/system/cpu/cpufreq/policy0/amd_pstate_prefcore_ranking:226
> /sys/devices/system/cpu/cpufreq/policy1/amd_pstate_prefcore_ranking:231
> /sys/devices/system/cpu/cpufreq/policy2/amd_pstate_prefcore_ranking:211
> /sys/devices/system/cpu/cpufreq/policy3/amd_pstate_prefcore_ranking:236
> /sys/devices/system/cpu/cpufreq/policy4/amd_pstate_prefcore_ranking:216
> /sys/devices/system/cpu/cpufreq/policy5/amd_pstate_prefcore_ranking:236
> /sys/devices/system/cpu/cpufreq/policy6/amd_pstate_prefcore_ranking:206
> /sys/devices/system/cpu/cpufreq/policy7/amd_pstate_prefcore_ranking:221
> /sys/devices/system/cpu/cpufreq/policy8/amd_pstate_prefcore_ranking:191
> /sys/devices/system/cpu/cpufreq/policy9/amd_pstate_prefcore_ranking:201
> /sys/devices/system/cpu/cpufreq/policy10/amd_pstate_prefcore_ranking:186
> /sys/devices/system/cpu/cpufreq/policy11/amd_pstate_prefcore_ranking:196
> /sys/devices/system/cpu/cpufreq/policy12/amd_pstate_prefcore_ranking:171
> /sys/devices/system/cpu/cpufreq/policy13/amd_pstate_prefcore_ranking:166
> /sys/devices/system/cpu/cpufreq/policy14/amd_pstate_prefcore_ranking:176
> /sys/devices/system/cpu/cpufreq/policy15/amd_pstate_prefcore_ranking:181
> /sys/devices/system/cpu/cpufreq/policy16/amd_pstate_prefcore_ranking:226
> /sys/devices/system/cpu/cpufreq/policy17/amd_pstate_prefcore_ranking:231
> /sys/devices/system/cpu/cpufreq/policy18/amd_pstate_prefcore_ranking:211
> /sys/devices/system/cpu/cpufreq/policy19/amd_pstate_prefcore_ranking:236
> /sys/devices/system/cpu/cpufreq/policy20/amd_pstate_prefcore_ranking:216
> /sys/devices/system/cpu/cpufreq/policy21/amd_pstate_prefcore_ranking:236
> /sys/devices/system/cpu/cpufreq/policy22/amd_pstate_prefcore_ranking:206
> /sys/devices/system/cpu/cpufreq/policy23/amd_pstate_prefcore_ranking:221
> /sys/devices/system/cpu/cpufreq/policy24/amd_pstate_prefcore_ranking:191
> /sys/devices/system/cpu/cpufreq/policy25/amd_pstate_prefcore_ranking:201
> /sys/devices/system/cpu/cpufreq/policy26/amd_pstate_prefcore_ranking:186
> /sys/devices/system/cpu/cpufreq/policy27/amd_pstate_prefcore_ranking:196
> /sys/devices/system/cpu/cpufreq/policy28/amd_pstate_prefcore_ranking:171
> /sys/devices/system/cpu/cpufreq/policy29/amd_pstate_prefcore_ranking:166
> /sys/devices/system/cpu/cpufreq/policy30/amd_pstate_prefcore_ranking:176
> /sys/devices/system/cpu/cpufreq/policy31/amd_pstate_prefcore_ranking:181
> ```
>
> When I run `dd if=/dev/zero of=/dev/null`, the load lands on cores 3, 5, 19
> or 21, i.e. those that have the highest `amd_pstate_prefcore_ranking`
> value, given that `schedutil` is in use.
>
> If all of the above is as expected, please add:
>
> Tested-by: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
>
> > > Would you also be able to Cc me on the next iteration of this patchset?
> > [Meng, Li (Jassmine)] OK.
>
> Thanks.
>
[Meng, Li (Jassmine)]
Thanks a lot.
Based on Wyes's suggestion, I have also made similar modifications for the next version of the patches.
All the log information above is in line with expectations.
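
For anyone who wants to check the ranking log mechanically rather than by eye, here is a minimal Python sketch. It is not part of the patchset: the helper names `parse_rankings` and `preferred_policies` are made up for illustration, and the sample data is just the first eight lines of the `amd_pstate_prefcore_ranking` output quoted above. It assumes the `path:value` format that `grep .` prints.

```python
import re

# First eight policies copied from the prefcore_ranking log in this thread.
# On a live system the same lines would come from running `grep .` on
# /sys/devices/system/cpu/cpufreq/policy*/amd_pstate_prefcore_ranking.
SAMPLE = """\
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_prefcore_ranking:226
/sys/devices/system/cpu/cpufreq/policy1/amd_pstate_prefcore_ranking:231
/sys/devices/system/cpu/cpufreq/policy2/amd_pstate_prefcore_ranking:211
/sys/devices/system/cpu/cpufreq/policy3/amd_pstate_prefcore_ranking:236
/sys/devices/system/cpu/cpufreq/policy4/amd_pstate_prefcore_ranking:216
/sys/devices/system/cpu/cpufreq/policy5/amd_pstate_prefcore_ranking:236
/sys/devices/system/cpu/cpufreq/policy6/amd_pstate_prefcore_ranking:206
/sys/devices/system/cpu/cpufreq/policy7/amd_pstate_prefcore_ranking:221
"""

# Matches ".../policyN/amd_pstate_prefcore_ranking:VALUE" at the end of a line.
LINE_RE = re.compile(r"/policy(\d+)/amd_pstate_prefcore_ranking:(\d+)$")


def parse_rankings(text):
    """Return {policy number: ranking} parsed from `grep .` style output."""
    rankings = {}
    for line in text.splitlines():
        match = LINE_RE.search(line.strip())
        if match:
            rankings[int(match.group(1))] = int(match.group(2))
    return rankings


def preferred_policies(rankings):
    """Return the sorted policy numbers that share the highest ranking."""
    if not rankings:
        return []
    best = max(rankings.values())
    return sorted(policy for policy, rank in rankings.items() if rank == best)


if __name__ == "__main__":
    print(preferred_policies(parse_rankings(SAMPLE)))
```

With the full 32-policy log as input, the same helpers should pick out the policies carrying the highest ranking value, which is where Oleksandr saw the `dd` load land.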

> > >
> > > Thank you!
> > >
> > > [1] https://lore.kernel.org/lkml/5973628.lOV4Wx5bFT@xxxxxxxxxxxxxx/
> > >
> > > >
> > > > > Changes from V6->V7:
> > > > - x86:
> > > > - - Modify kconfig about X86_AMD_PSTATE.
> > > > - cpufreq: amd-pstate:
> > > > - - modify incorrect comments about scheduler_work().
> > > > - - convert highest_perf data type.
> > > > - - modify preferred core init when cpu init and online.
> > > > - acpi: cppc:
> > > > - - modify link of CPPC highest performance.
> > > > - cpufreq:
> > > > - - modify link of CPPC highest performance changed.
> > > >
> > > > > Changes from V5->V6:
> > > > - cpufreq: amd-pstate:
> > > > - - modify the wrong tag order.
> > > > - - modify warning about hw_prefcore sysfs attribute.
> > > > - - delete duplicate comments.
> > > > - - modify the variable name cppc_highest_perf to prefcore_ranking.
> > > > - - modify judgment conditions for setting highest_perf.
> > > > - - modify sysfs attribute for CPPC highest perf to pr_debug message.
> > > > - Documentation: amd-pstate:
> > > > - - modify warning: title underline too short.
> > > >
> > > > > Changes from V4->V5:
> > > > > - cpufreq: amd-pstate:
> > > > > - - modify sysfs attribute for CPPC highest perf.
> > > > > - - modify warning about comments
> > > > > - - rebase linux-next
> > > > > - cpufreq:
> > > > > - - Modify warning about function declarations.
> > > > - Documentation: amd-pstate:
> > > > - - align with ``amd-pstat``
> > > >
> > > > > Changes from V3->V4:
> > > > - Documentation: amd-pstate:
> > > > - - Modify inappropriate descriptions.
> > > >
> > > > > Changes from V2->V3:
> > > > - x86:
> > > > - - Modify kconfig and description.
> > > > - cpufreq: amd-pstate:
> > > > - - Add Co-developed-by tag in commit message.
> > > > - cpufreq:
> > > > - - Modify commit message.
> > > > - Documentation: amd-pstate:
> > > > - - Modify inappropriate descriptions.
> > > >
> > > > > Changes from V1->V2:
> > > > > - acpi: cppc:
> > > > > - - Add reference link.
> > > > > - cpufreq:
> > > > > - - Modify link error.
> > > > - cpufreq: amd-pstate:
> > > > - - Init the priorities of all online CPUs
> > > > - - Use a single variable to represent the status of preferred core.
> > > > - Documentation:
> > > > - - Default enabled preferred core.
> > > > - Documentation: amd-pstate:
> > > > - - Modify inappropriate descriptions.
> > > > - - Default enabled preferred core.
> > > > - - Use a single variable to represent the status of preferred core.
> > > >
> > > > Meng Li (7):
> > > > x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion.
> > > > acpi: cppc: Add get the highest performance cppc control
> > > > cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
> > > > cpufreq: Add a notification message that the highest perf has changed
> > > > cpufreq: amd-pstate: Update amd-pstate preferred core ranking
> > > > dynamically
> > > > Documentation: amd-pstate: introduce amd-pstate preferred core
> > > > > Documentation: introduce amd-pstate preferrd core mode kernel command
> > > > > line options
> > > >
> > > > .../admin-guide/kernel-parameters.txt | 5 +
> > > > Documentation/admin-guide/pm/amd-pstate.rst | 59 +++++-
> > > > arch/x86/Kconfig | 5 +-
> > > > drivers/acpi/cppc_acpi.c | 13 ++
> > > > drivers/acpi/processor_driver.c | 6 +
> > > > drivers/cpufreq/amd-pstate.c | 186 ++++++++++++++++--
> > > > drivers/cpufreq/cpufreq.c | 13 ++
> > > > include/acpi/cppc_acpi.h | 5 +
> > > > include/linux/amd-pstate.h | 10 +
> > > > include/linux/cpufreq.h | 5 +
> > > > 10 files changed, 285 insertions(+), 22 deletions(-)
> > > >
> > > >
> > >
> > >
> > > --
> > > Oleksandr Natalenko (post-factum)
> >
>
>
> --
> Oleksandr Natalenko (post-factum)