On 03-Jul 14:38, Douglas Raillard wrote:
> Hi Peter,
>
> On 7/2/19 4:44 PM, Peter Zijlstra wrote:
> > On Thu, Jun 27, 2019 at 06:15:58PM +0100, Douglas RAILLARD wrote:
> > > Make schedutil cpufreq governor energy-aware.
> > >
> > > - patch 1 introduces a function to retrieve a frequency given a base
> > >   frequency and an energy cost margin.
> > > - patch 2 links Energy Model perf_domain to sugov_policy.
> > > - patch 3 updates get_next_freq() to make use of the Energy Model.
> > >
> > > 1) Selecting the highest possible frequency for a given cost. Some
> > >    platforms can have lower frequencies that are less efficient than
> > >    higher ones, in which case they should be skipped for most purposes.
> > >    They can still be useful to give more freedom to thermal throttling
> > >    mechanisms, but not under normal circumstances.
> > >    note: the EM framework will warn about such OPPs "hertz/watts ratio
> > >    non-monotonically decreasing"
> >
> > Humm, for some reason I was thinking we explicitly skipped those OPPs
> > and they already weren't used.
> >
> > This isn't in fact so, and these first few patches make it so?
>
> That's correct, the cost information about each OPP has been introduced recently in mainline
> by the energy model series. Without that info, the only way to skip them that comes to my
> mind is to set a policy min frequency, since these inefficient OPPs are usually located
> at the lower end.

Perhaps it's also worth pointing out that the alternative approach you
describe above (setting a policy min frequency) is a system-wide solution.

The ramp_boost thingy you propose, on the other hand, is a more
fine-grained mechanism which could be extended in the future to have a
per-task side. IOW, it could contribute to better user-space hints, for
example to ramp_boost certain tasks and not others.
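
To make the "inefficient OPP" case discussed above concrete, here is a
minimal userspace sketch. It is not the code from the series: struct opp,
example_table, opp_cost(), mark_inefficient(), highest_freq_for_cost() and
the percentage-based margin are hypothetical names and units picked for
illustration, and the table values are invented. The only thing taken from
the EM framework is the idea of a per-OPP cost proportional to
power * fmax / freq; treat even that as an assumption about the exact
scaling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical OPP entry for this sketch, NOT the kernel's EM structure. */
struct opp {
	uint64_t freq_khz;
	uint64_t power_mw;
};

/* Invented numbers, sorted by ascending frequency. */
static const struct opp example_table[] = {
	{  500000,  300 },
	{  800000,  350 },
	{ 1200000,  700 },
	{ 1800000, 1500 },
};

/* Cost of an OPP: power * fmax / freq (integer division). */
static uint64_t opp_cost(const struct opp *table, size_t n, size_t idx)
{
	uint64_t fmax = table[n - 1].freq_khz;

	return fmax * table[idx].power_mw / table[idx].freq_khz;
}

/*
 * Flag every OPP for which some higher frequency has a lower or equal
 * cost. Walks the table from the highest frequency down, tracking the
 * minimum cost seen so far. Returns the number of flagged OPPs.
 */
static size_t mark_inefficient(const struct opp *table, size_t n,
			       bool *inefficient)
{
	uint64_t min_cost;
	size_t i, count = 0;

	assert(n > 0);
	min_cost = opp_cost(table, n, n - 1);
	inefficient[n - 1] = false;

	for (i = n - 1; i-- > 0; ) {
		uint64_t cost = opp_cost(table, n, i);

		inefficient[i] = cost >= min_cost;
		if (inefficient[i])
			count++;
		else
			min_cost = cost;
	}
	return count;
}

/*
 * Highest frequency whose cost stays within the cost of the base OPP
 * plus margin_pct percent (the percentage unit is an assumption).
 */
static uint64_t highest_freq_for_cost(const struct opp *table, size_t n,
				      size_t base_idx, unsigned int margin_pct)
{
	uint64_t base_cost, max_cost;
	size_t i;

	assert(base_idx < n);
	base_cost = opp_cost(table, n, base_idx);
	max_cost = base_cost + base_cost * margin_pct / 100;

	for (i = n - 1; i > base_idx; i--) {
		if (opp_cost(table, n, i) <= max_cost)
			return table[i].freq_khz;
	}
	return table[base_idx].freq_khz;
}
```

With the invented table, only the 500 MHz entry gets flagged, because
800 MHz has a lower cost. Asking for the highest frequency at the cost of
the 500 MHz entry with a zero margin therefore skips past it and lands on
1.2 GHz. A policy min frequency would hide that entry for everybody, while
a margin-based lookup like this one can be driven per request.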
Best,
Patrick