Re: [PATCH] cpufreq: stats: Switch to ktime and msec instead of jiffies and usertime

From: Rafael J. Wysocki
Date: Tue Nov 10 2020 - 07:59:57 EST


On Tue, Nov 10, 2020 at 12:07 PM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>
> The cpufreq and thermal core, both provide sysfs statistics to help
> userspace learn about the behavior of frequencies and cooling states.
>
> This is how they look:
>
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:208000 11
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:432000 147
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:729000 1600
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:960000 879
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:1200000 399
>
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state0 4097
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state1 8932
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state2 15868
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state3 1384
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state4 103
>
> Here, state0 of thermal corresponds to the highest frequency of the CPU,
> i.e. 1200000 and state4 to the lowest one.
>
> While both of these try to show similar kind of data (which can still be
> very much different from each other), the values looked different (by a
> factor of 10, i.e. thermal's time_in_state is almost 10 times that of
> cpufreq time_in_state).
>
> This comes from the fact that cpufreq core displays the time in usertime
> units (10 ms).
>
> It would be better if both the frameworks displayed times in the same
> unit as the users may need to correlate between them and different
> scales just make it awkward. And the choice of thermal core for that
> (msec) seems to be a better choice as it is easier to read.
>
> The thermal core also does the stats calculations using ktime, which is
> much more accurate as compared to jiffies used by cpufreq core.
>
> This patch updates the cpufreq core to use ktime for the internal
> calculations and changes the units of time_in_state to msec.

Well, this may confuse user space using the stats today.

>
> The results look like this after this commit:
>
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:208000 13
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:432000 790
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:729000 12492
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:960000 13259
> /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state:1200000 3830
>
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state0 3888
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state1 13432
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state2 12336
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state3 740
> /sys/class/thermal/cooling_device0/stats/time_in_state_ms:state4 0
>
> FWIW, tools/power/cpupower/ does consume the time_in_state values from
> the sysfs files but it is independent of the unit of the time and didn't
> require an update.

But whoever uses cpupower may be confused.