On Mon, Apr 15, 2024 at 09:23:10PM -0700, Vanshidhar Konda wrote:
On Fri, Apr 05, 2024 at 02:33:19PM +0100, Beata Michalska wrote:
> Some architectures provide a way to determine an average frequency over
> a certain period of time based on available performance monitors (AMU on
> ARM or APERF/MPERF on x86). With those at hand, enroll arch_freq_get_on_cpu
> into cpuinfo_cur_freq policy sysfs attribute handler, which is expected to
> represent the current frequency of a given CPU, as obtained by the hardware.
> This is the type of feedback that counters do provide.
>
--- snip ---

While testing this patch series on an AmpereOne system, I simulated CPU
frequency throttling when the system is under power or thermal
constraints.

In this scenario, based on the user guide, I expect that scaling_cur_freq
is the frequency the kernel requests from the hardware and cpuinfo_cur_freq
is the actual frequency that the hardware is able to run at during the
power or thermal constraints.

There has been a discussion on scaling_cur_freq vs cpuinfo_cur_freq [1].
The guidelines you are referring to here (assuming you mean [2]) are kinda
out-of-sync already, as scaling_cur_freq has been wired earlier to use
arch-specific feedback. As there was no Arm-dedicated implementation of
arch_freq_get_on_cpu, this went kinda unnoticed.
The conclusion of the above-mentioned discussion (though not stated
explicitly) was to keep the current behaviour of scaling_cur_freq and align
both across different archs: so with the patches, both attributes will provide
hw feedback on current frequency, when available.
Note that if we are to adhere to the docs, cpuinfo_cur_freq is really the place
to use the counters.
That change was also requested through [3].
Adding @Viresh in case there was any shift in the tides ....
The AmpereOne system I'm testing on has the following configuration:
- Max frequency is 3000000
- Support for AMU registers
- ACPI CPPC feedback counters use PCC register space
- Fedora 39 with 6.7.5 kernel
- Fedora 39 with 6.9.0-rc3 + this patch series

With 6.7.5 kernel:
Core    scaling_cur_freq    cpuinfo_cur_freq
----    ----------------    ----------------
0       3000000             2593000
1       3000000             2613000
2       3000000             2625000
3       3000000             2632000

So if I got it right from the info you have provided, the numbers above were
obtained without applying the patches. In that case, scaling_cur_freq will
use policy->cur (in your case), showing the last frequency set, not necessarily
the actual freq, whereas cpuinfo_cur_freq uses __cpufreq_get and AMU counters.
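
For anyone wanting to reproduce a table like the one above, here is a minimal
sketch that reads both attributes for every CPU. It assumes the standard
/sys/devices/system/cpu/cpuN/cpufreq/ layout; the helper names read_khz and
compare_cur_freq are made up for illustration. Note that cpuinfo_cur_freq is
typically readable by root only, so the sketch reports None when a file cannot
be read.

```python
from pathlib import Path


def read_khz(path):
    """Return the integer value (kHz) stored in a sysfs file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def compare_cur_freq(root="/sys/devices/system/cpu"):
    """Map CPU number -> (scaling_cur_freq, cpuinfo_cur_freq), both in kHz."""
    result = {}
    for cpufreq_dir in Path(root).glob("cpu[0-9]*/cpufreq"):
        cpu = int(cpufreq_dir.parent.name[3:])  # "cpu12" -> 12
        result[cpu] = (
            read_khz(cpufreq_dir / "scaling_cur_freq"),
            read_khz(cpufreq_dir / "cpuinfo_cur_freq"),
        )
    return dict(sorted(result.items()))


if __name__ == "__main__":
    print(f"{'Core':<6}{'scaling_cur_freq':<20}cpuinfo_cur_freq")
    for cpu, (scaling, cpuinfo) in compare_cur_freq().items():
        print(f"{cpu:<6}{scaling!s:<20}{cpuinfo!s}")
```

Run as root to get values in both columns.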
With 6.9.0-rc3 + this patch series:
Core    scaling_cur_freq    cpuinfo_cur_freq
----    ----------------    ----------------
0       2671875             2671875
1       2589632             2589632
2       2648437             2648437
3       2698242             2698242

With the patches applied, both scaling_cur_freq and cpuinfo_cur_freq will use AMU
counters, or, to be more precise, the FIE scale factor obtained based on AMU
counters: both should now show similar/same frequency (as discussed in [1]).
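
To illustrate the general idea behind counter-based readings (this is not the
kernel implementation, just the arithmetic): an average frequency over a
sampling window can be derived from the ratio of a core-cycle counter delta to
a constant-rate counter delta. The function name, the 1 GHz constant-counter
rate and the counter values below are assumptions picked for the example, not
measurements.

```python
def avg_freq_khz(core_start, core_end, const_start, const_end, const_rate_khz):
    """Average CPU frequency (kHz) over the window between two counter snapshots.

    core_*  : core-cycle counter, ticks at the actual CPU frequency
    const_* : constant counter, ticks at a fixed known rate (const_rate_khz)
    """
    delta_core = core_end - core_start
    delta_const = const_end - const_start
    if delta_const <= 0:
        raise ValueError("constant counter did not advance")
    # freq = const_rate * (core cycles elapsed / constant cycles elapsed)
    return delta_core * const_rate_khz // delta_const


if __name__ == "__main__":
    # Hypothetical 1 ms window with the constant counter running at 1 GHz.
    print(avg_freq_khz(0, 2_671_875, 0, 1_000_000, 1_000_000))
```

Because the result is an average over the window, it reflects what the hardware
actually delivered rather than the last frequency requested.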
I'd say that, due to the existing implementation of scaling_cur_freq (which we
cannot change at this point), this is unavoidable.
In the second case we can't identify that the CPU frequency is
being throttled by the hardware. I noticed this behavior with
or without this patch.

I am not entirely sure comparing the two should be the way to go about detecting
throttling (whether w/ or w/o the changes).
It would probably be best to refer to thermal sysfs and get a hold of cur_state
which should indicate current throttle state:
/sys/class/thermal/thermal_zone[0-*]/cdev[0-*]/cur_state
with values above '0' implying ongoing throttling.
The appropriate thermal_zone can be identified through 'type' attribute.
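
A rough sketch of the check described above, assuming the usual
/sys/class/thermal layout where each thermal_zone directory exposes a 'type'
file and cdevN links to its cooling devices. The function name
active_cooling_devices is made up for illustration.

```python
from pathlib import Path


def active_cooling_devices(root="/sys/class/thermal"):
    """Return (zone_type, cdev_name, cur_state) for every cdev with cur_state > 0."""
    active = []
    for zone in sorted(Path(root).glob("thermal_zone[0-9]*")):
        try:
            zone_type = (zone / "type").read_text().strip()
        except OSError:
            zone_type = zone.name  # fall back to the directory name
        for cdev in sorted(zone.glob("cdev[0-9]*")):
            # Skip sibling files such as cdevN_trip_point; only the cdevN
            # entries themselves are directories (links to cooling devices).
            if not cdev.is_dir():
                continue
            try:
                state = int((cdev / "cur_state").read_text().strip())
            except (OSError, ValueError):
                continue
            if state > 0:
                active.append((zone_type, cdev.name, state))
    return active


if __name__ == "__main__":
    for zone_type, cdev, state in active_cooling_devices():
        print(f"{zone_type}: {cdev} cur_state={state}")
```

An empty result means no cooling device bound to a thermal zone currently
reports a state above 0.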
Thank you for giving those patches a spin.
---
BR
Beata
---
[1] https://lore.kernel.org/all/20230609043922.eyyqutbwlofqaddz@vireshk-i7/
[2] https://elixir.bootlin.com/linux/latest/source/Documentation/admin-guide/pm/cpufreq.rst#L197
[3] https://lore.kernel.org/lkml/2cfbc633-1e94-d741-2337-e1b0cf48b81b@xxxxxxxxxx/
---
Thanks,
Vanshi