Re: [PATCH v7] sched: Consolidate cpufreq updates

From: Anjali K
Date: Mon Nov 25 2024 - 01:32:51 EST


On 19/10/24 00:02, Anjali K wrote:
>> Do you mind trying schedutil with a reasonable rate_limit_us, too?
> I think the schedutil governor is not working on my system because the cpu
> frequency shoots to the maximum (3.9GHz) even when the system is only 10%
> loaded.
> I ran stress-ng --cpu `nproc` --cpu-load 10.
> The mpstat command shows that the system is 10% loaded:
> 10:55:25 AM CPU    %usr   %nice    %sys %iowait    %irq   %soft %steal %guest %gnice   %idle
> 10:56:50 AM all   10.03    0.00    0.02    0.00    0.18    0.00    0.00    0.00    0.00   89.76
> But cpupower frequency-info showed that the system is at max frequency
> root@ltczz10:~# cpupower frequency-info
> <snipped>
> available cpufreq governors: conservative ondemand performance schedutil
> current policy: frequency should be within 2.30 GHz and 3.90 GHz.
>                   The governor "schedutil" may decide which speed to use
>                   within this range.
> current CPU frequency: 3.90 GHz (asserted by call to hardware)
> <snipped>
> This is not expected, right?
> I will work on finding out why the schedutil governor is not working on
> this system and get back.
Hi, I found that the schedutil governor is in fact working on this system.
When I printed the util parameter passed to get_next_freq() while running
stress-ng --cpu `nproc` --cpu-load 10, util was always 1024 (equal to the
CPU capacity), so the frequency is set to the maximum, as expected. After
adding `--cpu-load-slice 10` to the stress-ng command line, I got lower util
values and found that the frequency is set according to the system load, as
shown below:

+-------------+------------+------------+
|  stress-ng  |    avg     | run-to-run |
|   load %    |  cpu freq  | std dev %  |
|             |   (GHz)    |            |
+-------------+------------+------------+
|     10%     |    2.80    |    1.51    |
|     30%     |    3.53    |    2.47    |
|     50%     |    3.70    |    0.01    |
|     70%     |    3.61    |    0.08    |
|     90%     |    3.54    |    0.04    |
+-------------+------------+------------+
Note that the frequency range for this system is 2.3GHz - 3.9GHz.
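
As a rough illustration of why util = 1024 pins the frequency at the
maximum, the sketch below models the schedutil mapping as
next_freq ~= 1.25 * max_freq * util / max_capacity, clamped to the policy
limits. The function name, the kHz units and the simple clamp are
assumptions made for illustration; this is not the kernel code, and it
ignores rate limiting and driver frequency resolution.

```python
# Simplified model of the schedutil utilization-to-frequency mapping.
# Assumption: policy limits of 2.3 GHz - 3.9 GHz, as on the system above.

def next_freq_khz(util, max_capacity=1024,
                  min_khz=2_300_000, max_khz=3_900_000):
    """Return the frequency (kHz) this model picks for a given util."""
    # Add 25% headroom on top of the maximum frequency, then scale by
    # the utilization relative to the CPU capacity.
    raw = (max_khz + (max_khz >> 2)) * util // max_capacity
    # Clamp to the policy's frequency range.
    return max(min_khz, min(max_khz, raw))


if __name__ == "__main__":
    for util in (0, 102, 512, 820, 1024):
        print(util, next_freq_khz(util))
```

In this model any util at or above roughly 80% of the capacity already
requests the maximum frequency, and low util values fall back to the policy
minimum.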

The results with the schedutil governor for the same set of benchmarks are
as follows. Each benchmark is run 3 times:
+------------------------------------------------------+--------------------+----------+--------+---------+------------+
|                     Benchmark                        |      Baseline      | Baseline |Baseline|Baseline |Regression %|
|                                                      |  (6.10.0-rc1 tip   | + patch  |        |+ patch  |            |
|                                                      |    sched/core)     |          |stdev % |stdev %  |            |
+------------------------------------------------------+--------------------+----------+--------+---------+------------+
|Hackbench run duration (sec)                          |         1          |   1.01   |  1.60  |  1.80   |    0.69    |
|Lmbench simple fstat (usec)                           |         1          |   0.99   |  0.40  |  0.07   |   -0.66    |
|Lmbench simple open/close (usec)                      |         1          |   0.99   |  0.01  |  0.04   |   -0.51    |
|Lmbench simple read (usec)                            |         1          |   1      |  0.23  |  0.41   |   -0.05    |
|Lmbench simple stat (usec)                            |         1          |   0.98   |  0.13  |  0.03   |   -1.54    |
|Lmbench simple syscall (usec)                         |         1          |   0.99   |  0.89  |  0.69   |   -0.59    |
|Lmbench simple write (usec)                           |         1          |   1      |  0.27  |  0.80   |    0       |
|Unixbench execl throughput (lps)                      |         1          |   1      |  0.44  |  0.13   |    0.17    |
|Unixbench Process Creation (lps)                      |         1          |   0.99   |  0.11  |  0.13   |   -0.68    |
|Unixbench Shell Scripts (1 concurrent) (lpm)          |         1          |   1      |  0.07  |  0.05   |    0.03    |
|Unixbench Shell Scripts (8 concurrent) (lpm)          |         1          |   1      |  0.05  |  0.11   |   -0.13    |
+------------------------------------------------------+--------------------+----------+--------+---------+------------+
I did not see any significant improvements or regressions on applying the
patch. I ignored the Stress-ng and Unixbench Pipe-based Context Switching
benchmarks as they showed high run-to-run variation with the schedutil
governor (without applying the patch) of 10.68% and 12.5% respectively.
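
For readers who want to reproduce this kind of summary, the sketch below
shows one plausible way to derive the normalized result, the run-to-run
std dev % and the percent change from repeated runs. The helper name, the
use of the sample standard deviation and the sign convention (positive
means the patched value is larger) are assumptions, not necessarily how
the numbers above were produced, and the inputs in the usage example are
made-up placeholders.

```python
# Sketch: summarize repeated benchmark runs for a baseline and a patched
# kernel. All names and the sample data are hypothetical.
from statistics import mean, stdev


def summarize(baseline_runs, patched_runs):
    """Return normalized result, stdev % per kernel, and percent change."""
    base_mean = mean(baseline_runs)
    patched_mean = mean(patched_runs)
    return {
        # Patched result relative to a baseline normalized to 1.
        "normalized_patched": patched_mean / base_mean,
        # Run-to-run variation as a percentage of each mean.
        "baseline_stdev_pct": 100 * stdev(baseline_runs) / base_mean,
        "patched_stdev_pct": 100 * stdev(patched_runs) / patched_mean,
        # Positive means the patched value is larger than the baseline.
        "change_pct": 100 * (patched_mean - base_mean) / base_mean,
    }


if __name__ == "__main__":
    # Placeholder durations in seconds, three runs each.
    print(summarize([10.0, 10.2, 9.8], [10.1, 10.3, 9.9]))
```

Whether a positive change counts as a regression depends on the metric:
for durations and latencies larger is worse, for throughput larger is
better.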

Thank you,
Anjali K
