RE: [PATCH v4 1/6] x86,sched: Add support for frequency invariance

From: Doug Smythies
Date: Mon Dec 23 2019 - 02:47:28 EST


Hi Qais,

Thank you for your follow-up.

On 2019.12.19 02:48 Qais Yousef wrote:
> On 11/28/19 14:48, Doug Smythies wrote:
>> Summary: There never was an issue here.
>>
>> Sorry for the noise of this thread, and the resulting waste of time.
>>
>> On 2019.11.26 23:33 Doug Smythies wrote:
>>> On 2019.11.26 07:20 Giovanni Gherdovich wrote:
>>>> On Mon, 2019-11-25 at 21:59 -0800, Doug Smythies wrote:
>>>>> [...]
>>>>> The issue with the schedutil governor not working properly in the 5.4 RC series
>>>>> appears to be hardware dependent.
>>
>> No, it's not.
>>
>> Issues with my Sandy Bridge, i7-2600K, test computer and kernel 5.4
>> seem to be because it is running an older Ubuntu server version,
>> apparently somewhat dependent on cgroup V1 and their cgmanager package.
>> I am unable to remove the package to test further because I do use VMs
>> that seem to depend on it.
>>
>> In the kernel configuration when CONFIG_UCLAMP_TASK_GROUP=y
>> the computer behaves as though the new parameter "cpu.uclamp.min"
>> is set to max rather than 0, but I can not prove it.
>
> I just noticed this. This option shouldn't cause any problem, if it does there
> might be a bug that we need to fix.
>
> So cpu.uclamp.min reads 0 but you think it's not taking effect, correct?

Actually, on the i7-2600K older distro test computer, I couldn't find
cpu.uclamp.min to read its setting. However, yes the behaviour of the governor
was as though that value was set to maximum (read on).

>
> In the quotes above I see 5.4 RC, if you haven't tried this against the final
> 5.4 release, do you mind trying to see if you can reproduce? Trying 5.5-rc2
> would be helpful too if 5.4 fails.

My test setup and baseline distribution versions have changed since November,
when I did those tests. However, I was able to rig up a bootable old SSD
and reproduce the issue with kernel 5.5-rc2. More importantly,
I was able to reproduce the issue with the current i7-2600K test computer
(Ubuntu server 20.04 development, upgraded version) and kernel 5.5-rc2.
Note that I also have access to another test computer, i5-9600K based (Ubuntu
server 20.04 development, fresh install), that does not show this issue.

Detail:

If formatting gets messed up in this e-mail, then the content,
and links to more details, are also here:
http://www.smythies.com/~doug/linux/single-threaded/k54regression/qais.html

CPU frequency scaling driver: intel_pstate, in passive (intel-cpufreq) mode.
CPU frequency scaling governor: various.
CPU Idle driver: intel_idle; Governor: teo.

kernels ("stock", "notset" and "nocgv1"):
stock: CONFIG_UCLAMP_TASK_GROUP=y
notset: # CONFIG_UCLAMP_TASK_GROUP is not set
nocgv1: is "stock" booted with "cgroup_no_v1=all" on the grub kernel command line.

Linux s15 5.5.0-rc2-stock #768 SMP PREEMPT Fri Dec 20 16:19:44 PST 2019 x86_64 x86_64 x86_64 GNU/Linux
Linux s18 5.5.0-rc2-notset #769 SMP PREEMPT Fri Dec 20 18:43:59 PST 2019 x86_64 x86_64 x86_64 GNU/Linux

kernel configuration differences:

doug@s15:~/temp-k-git/linux$ scripts/diffconfig /boot/config-5.5.0-rc2-stock /boot/config-5.5.0-rc2-notset
UCLAMP_TASK_GROUP y -> n
doug@s15:~/temp-k-git/linux$
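
As a rough sketch of how the three kernel variants described above could be
told apart on a running system: the function names below are hypothetical
(they are not from any tool used for these tests), and the only strings
relied on are the config lines and the command line option quoted in this mail.

```shell
# Classify a kernel config file using the labels from this mail.
# kernel_variant is a hypothetical helper name, used for illustration only.
kernel_variant() {
    config_file="$1"
    if grep -q '^CONFIG_UCLAMP_TASK_GROUP=y' "$config_file"; then
        echo "stock"
    elif grep -q '^# CONFIG_UCLAMP_TASK_GROUP is not set' "$config_file"; then
        echo "notset"
    else
        echo "unknown"
    fi
}

# "nocgv1" is "stock" booted with cgroup_no_v1=all, so check the command line.
# The optional argument exists only so a saved copy of the command line can be checked.
has_cgroup_no_v1() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -q 'cgroup_no_v1=all' "$cmdline_file"
}
```

Typical use would be something like
kernel_variant "/boot/config-$(uname -r)" followed by has_cgroup_no_v1,
assuming the distribution installs the config file under /boot.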

The test methods used herein are greatly sped up by switching
to just a couple of PIDs-per-second samples, instead of
a great many. Also, disk I/O is not used, eliminating any
access-time-related non-repeatability and sparing my SSD
from thrashing. Note that several governors had CPU frequency variations
with time, resulting in variability in the PIDs-per-second number.

There are two tests, the performance metric being
the number of PIDs per second consumed:

test 1:

Dountil terminated:
  Launch a null program (uses a new PID per call).
  Wait for it to finish.
Enduntil

test 2:

Dountil terminated:
  Launch a program with a package of work to do (uses a new PID per call).
  Wait for it to finish.
Enduntil
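
The two loops above could be approximated with a sketch like the following.
This is not the actual test program: it runs a fixed number of launches
instead of running until terminated, the function name and defaults are
assumptions, and it reads /proc/uptime with the shell builtin "read" so that
the timing itself does not consume extra PIDs.

```shell
# Launch a program repeatedly, waiting for each instance to finish,
# and print the approximate number of PIDs consumed per second.
# Arguments (hypothetical defaults): number of launches, program to run.
pids_per_second() {
    launches="${1:-2000}"
    program="${2:-/bin/true}"      # test 1: a null program
    read -r t_start rest < /proc/uptime
    i=0
    while [ "$i" -lt "$launches" ]; do
        "$program"                 # foreground run: new PID, then wait for exit
        i=$((i + 1))
    done
    read -r t_end rest < /proc/uptime
    # /proc/uptime has 10 ms resolution, so guard against a zero interval.
    LC_ALL=C awk -v n="$launches" -v a="$t_start" -v b="$t_end" \
        'BEGIN { dt = b - a; if (dt > 0) printf "%d\n", n / dt; else print 0 }'
}
```

For test 2, the second argument would be the path of a program that does a
package of work before exiting.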

The assumed fastest and master reference test run is using the performance governor
and forcing CPU affinity. All other calculations are relative to this result.
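
As an illustration of the relative calculation, the "ratio" column in the
results appears to be the reference PIDs per second divided by the measured
PIDs per second. That reading is inferred from the numbers in the tables and
not stated explicitly, and the helper name is hypothetical:

```shell
# ratio REFERENCE MEASURED: how many times slower a run is than the reference run.
# Assumed definition: reference PID/S divided by measured PID/S, one decimal place.
ratio() {
    LC_ALL=C awk -v r="$1" -v m="$2" 'BEGIN { printf "%.1f\n", r / m }'
}

# Example with numbers from the Ubuntu server 16.04.6 table:
# ratio 3934 1650    # prints 2.4
# ratio 3934 2787    # prints 1.4
```

With this reading, a schedutil ratio near 1.0 means the governor is behaving
like the performance governor, which is what the FAIL marks flag.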

Results:

i7-2600K computer booted with Ubuntu server 16.04.6, test 1 only:

Governor      kernel
              notset            stock             notset
              PID/S  ratio      PID/S  ratio      PID/S  ratio
schedutil      1650    2.4       3935    1.0 FAIL  1645    2.4
ondemand       2787    1.4       2787    1.4       2787    1.4
performance    3925    1.0       3940    1.0       3940    1.0
conservative   2545    1.5       2540    1.5       2530    1.6
powersave      1645    2.4       1655    2.4       1650    2.4
reference      3934    1.0       3917    1.0       3950    1.0

i7-2600K computer booted with Ubuntu server 20.04 dev, test 1:

Governor      kernel
              stock             notset            stock             notset            nocgv1
              PID/S  ratio      PID/S  ratio      PID/S  ratio      PID/S  ratio      PID/S  ratio
schedutil      3310    1.1 FAIL  1455    2.4       3250    1.1 FAIL  1465    2.4       3220    1.1 FAIL
ondemand       2510    1.4       2485    1.4       2495    1.4       2490    1.4       2460    1.4
performance    3333    1.1       3254    1.1       3250    1.1       3360    1.0       3220    1.1
conservative   2230    1.6       2260    1.5       2280    1.5       2220    1.6       2230    1.6
powersave      1470    2.4       1455    2.4       1460    2.4       1470    2.4       1450    2.4
reference      3521    1.0       3500    1.0       3526    1.0       3500    1.0       3500    1.0

i7-2600K computer booted with Ubuntu server 20.04 dev, test 2:

Governor      kernel
              stock             notset            nocgv1
              PID/S  ratio      PID/S  ratio      PID/S  ratio
schedutil       405    1.1 FAIL   177    2.4        405    1.1 FAIL
ondemand        371    1.1        371    1.1        371    1.1
performance     408    1.0        405    1.0        405    1.0
conservative    362    1.2        365    1.2        365    1.2
powersave       177    2.4        177    2.4        177    2.4
reference       423    1.0        423    1.0        423    1.0

The "nocgv1" (cgroup_no_v1=all) kernel is of particular interest because
now uclamp variables are available:

root@s15:/sys/fs/cgroup/user.slice# echo "+cpu" > cgroup.subtree_control
root@s15:/sys/fs/cgroup/user.slice# cat cgroup.subtree_control
cpu memory pids
root@s15:/sys/fs/cgroup/user.slice# grep . cpu\.uclamp*
cpu.uclamp.max:max
cpu.uclamp.min:0.00

This is repeatable:
To make the schedutil governor respond as expected thereafter
and until the next re-boot, do this:

# echo 0 > cpu.uclamp.min

Attempts to kick the schedutil governor response via
/sys/devices/system/cpu/intel_pstate/max_perf_pct and
/sys/devices/system/cpu/intel_pstate/min_perf_pct had no effect.
Other modifications of the cpu.uclamp.min and max variables also
kick the schedutil governor out of whatever state it was in.

This test was done 5 times:

1. Re-boot to the nocgv1 (stock + cgroup_no_v1=all) kernel.
2. Set the schedutil governor.
3. Launch test 2 and related monitoring tools.
4. Verify performance governor like behaviour.
5. echo 0 > /sys/fs/cgroup/user.slice/cpu.uclamp.min
6. Verify schedutil governor like behaviour.
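
The uclamp part of that procedure could be wrapped in a small helper such as
the sketch below. The function name is hypothetical and the default directory
is simply the one used in this mail; it assumes a cgroup v2 hierarchy with the
cpu controller enabled for that directory, and it needs to be run as root.

```shell
# Print the uclamp settings of a cgroup directory, rewrite cpu.uclamp.min
# with 0 (the write that kicked schedutil back to normal here), print again.
kick_uclamp_min() {
    cgroup_dir="${1:-/sys/fs/cgroup/user.slice}"
    echo "before: min=$(cat "$cgroup_dir/cpu.uclamp.min") max=$(cat "$cgroup_dir/cpu.uclamp.max")"
    echo 0 > "$cgroup_dir/cpu.uclamp.min"
    echo "after:  min=$(cat "$cgroup_dir/cpu.uclamp.min") max=$(cat "$cgroup_dir/cpu.uclamp.max")"
}
```

Note that on the affected computer the value already reads 0.00 before the
write, so the printed values alone would not be expected to show the change
in governor behaviour.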

... Doug