Re: problem in changing from active to passive mode
From: Doug Smythies
Date: Wed Oct 27 2021 - 11:10:24 EST
On Tue, Oct 26, 2021 at 8:13 AM Julia Lawall <julia.lawall@xxxxxxxx> wrote:
>
> The problem is illustrated by the attached graphs. These graphs on the
> odd-numbered pages show the frequency of each core, measured at every clock
> tick. At each measurement there is a small bar representing 4ms of the
> color associated with the frequency. The percentages shown are thus not
> entirely accurate, because the frequency could change within those 4ms and
> we would not observe that.
>
> The first graph, 5.9schedutil_yeti, shows the normal behavior of schedutil.
> The application mostly uses the second-highest turbo mode, which
> is the appropriate one given that there are around 5 active cores most of
> the time. I traced power:cpu_frequency, which is the event that occurs
> when the OS requests a change of frequency. This happens around 5400
> times.
>
> The second graph, 5.15-schedutil_yeti, shows the same run on the latest
> version of Linus's tree. The cores are almost always at the lowest
> frequency. There are no
> occurrences of the power:cpu_frequency event.
>
> The third graph, 5.9schedutil_after_yeti, is what happens when I reboot
> into 5.9 after having changed to passive mode in 5.15. The number of
> power:cpu_frequency events drops to around 1100. The proper turbo mode is
> actually used sometimes, but much less than in the first graph. More than
> half of the time, an active core is at the lowest frequency.
>
> This application (avrora from the DaCapo benchmarks) is continually
> stopping and starting, each for very short intervals. This may discourage
> the hardware from raising the frequency of its own volition.
Agreed. This type of workflow has long been known to be a challenge
for various CPU frequency scaling governors, and it comes up every so
often on the linux-pm email list. Basically, the schedutil CPU frequency
scaling governor becomes somewhat indecisive under these conditions.
However, if for some reason it gets kicked up to the maximum CPU
frequency, then it will often stay there (depending on the details of
the workflow; it stays up for my workflows).
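For reference, this kind of workload can be approximated with a simple
duty-cycle loop along the following lines. This is only an illustrative
sketch, not Julia's benchmark or my exact test, and the 2 ms burst and
sleep lengths are made-up parameters; the point is just that the task
keeps stopping and starting on a millisecond scale:

    /* Illustrative duty-cycle workload: short compute bursts separated
     * by short sleeps, which tends to keep utilization-based governors
     * indecisive. */
    #define _POSIX_C_SOURCE 199309L
    #include <time.h>

    static void busy_for_ms(long ms)
    {
            struct timespec start, now;

            clock_gettime(CLOCK_MONOTONIC, &start);
            do {
                    clock_gettime(CLOCK_MONOTONIC, &now);
            } while ((now.tv_sec - start.tv_sec) * 1000L +
                     (now.tv_nsec - start.tv_nsec) / 1000000L < ms);
    }

    int main(void)
    {
            struct timespec idle = { 0, 2 * 1000 * 1000 };  /* ~2 ms */

            for (;;) {
                    busy_for_ms(2);          /* ~2 ms of spinning */
                    nanosleep(&idle, NULL);  /* ~2 ms of sleep */
            }
    }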
Around the time of the commit you referenced in your earlier
email, it was recognised that the proposed changes were adding
a bit of a downward bias to the HWP-passive-schedutil case for
some of these difficult workflows [1].
I booted an old 5.9 kernel: HWP enabled, passive, schedutil.
I got the following for my ping-pong test type workflow, which
is not the best example (a rough sketch of such a test appears
after the numbers below):
Run 1: 6234 uSecs/loop
Run 2: 2813 uSecs/loop
Run 3: 2721 uSecs/loop
Run 4: 2813 uSecs/loop
Run 5: 11303 uSecs/loop
Run 6: 13803 uSecs/loop
Run 7: 2809 uSecs/loop
Run 8: 2796 uSecs/loop
Run 9: 2760 uSecs/loop
Run 10: 2691 uSecs/loop
Run 11: 9288 uSecs/loop
Run 12: 4275 uSecs/loop
Then the same with kernel 5.15-rc5
(I am a couple of weeks behind):
Run 1: 13618 uSecs/loop
Run 2: 13901 uSecs/loop
Run 3: 8929 uSecs/loop
Run 4: 12189 uSecs/loop
Run 5: 10338 uSecs/loop
Run 6: 12846 uSecs/loop
Run 7: 5418 uSecs/loop
Run 8: 7692 uSecs/loop
Run 9: 11531 uSecs/loop
Run 10: 9763 uSecs/loop
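For anyone wanting to try something similar, a minimal ping-pong type
test can be sketched as below. This is an illustrative sketch, not my
actual test program: a token bounces between two processes over a pair
of pipes, and the parent reports the average uSecs per loop (the loop
count is arbitrary):

    /* Illustrative ping-pong test: parent and child bounce one byte
     * back and forth over two pipes; report average uSecs per loop. */
    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    #define LOOPS 100000

    int main(void)
    {
            int ab[2], ba[2];  /* parent->child and child->parent pipes */
            char tok = 'x';
            struct timespec t0, t1;
            double us;
            int i;

            if (pipe(ab) || pipe(ba)) {
                    perror("pipe");
                    return 1;
            }

            if (fork() == 0) {  /* child: echo the token back */
                    for (i = 0; i < LOOPS; i++) {
                            if (read(ab[0], &tok, 1) != 1 ||
                                write(ba[1], &tok, 1) != 1)
                                    _exit(1);
                    }
                    _exit(0);
            }

            clock_gettime(CLOCK_MONOTONIC, &t0);
            for (i = 0; i < LOOPS; i++) {
                    if (write(ab[1], &tok, 1) != 1 ||
                        read(ba[0], &tok, 1) != 1) {
                            perror("ping-pong");
                            return 1;
                    }
            }
            clock_gettime(CLOCK_MONOTONIC, &t1);

            us = (t1.tv_sec - t0.tv_sec) * 1e6 +
                 (t1.tv_nsec - t0.tv_nsec) / 1e3;
            printf("%.0f uSecs/loop\n", us / LOOPS);
            return 0;
    }

The frequent short sleeps in read() are what make this the same class
of stop/start workload as discussed above.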
Now, for your graph 3, are you saying this pseudocode
version of the process is repeatable?
  power up the system, booting kernel 5.9
  switch to passive/schedutil
  wait X minutes for the system to settle
  run the benchmark: result ~13 seconds
  reboot into kernel 5.15-rc
  switch to passive/schedutil
  wait X minutes for the system to settle
  run the benchmark: result ~40 seconds
  reboot into kernel 5.9
  switch to passive/schedutil
  wait X minutes for the system to settle
  run the benchmark: result ~28 seconds
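For concreteness, the "switch to passive/schedutil" step above can be
done by writing to sysfs. Here is a minimal sketch, assuming the usual
intel_pstate sysfs layout and root privileges (the same thing can, of
course, be done from a shell by echoing into those files):

    /* Illustrative mode switch: put intel_pstate into passive mode and
     * select the schedutil governor on every CPU. */
    #include <glob.h>
    #include <stdio.h>

    static int write_str(const char *path, const char *val)
    {
            FILE *f = fopen(path, "w");
            int ok;

            if (!f) {
                    perror(path);
                    return -1;
            }
            ok = fprintf(f, "%s\n", val) >= 0;
            if (fclose(f) == EOF || !ok) {
                    perror(path);
                    return -1;
            }
            return 0;
    }

    int main(void)
    {
            glob_t g;
            size_t i;

            /* Put the driver into passive mode. */
            if (write_str("/sys/devices/system/cpu/intel_pstate/status",
                          "passive"))
                    return 1;

            /* Select schedutil as the governor for each CPU. */
            if (glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor",
                     0, NULL, &g) == 0) {
                    for (i = 0; i < g.gl_pathc; i++)
                            write_str(g.gl_pathv[i], "schedutil");
                    globfree(&g);
            }
            return 0;
    }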
... Doug
> I also tried
> a simple spin loop (for(;;);) with the 5.15 rc version, and it does go to
> the highest frequency as one would expect. But there are again no
> power:cpu_frequency events.
>
> julia
[1] https://www.spinics.net/lists/kernel/msg3775304.html