Re: [RFC PATCH 0/7] sched: cpufreq: Remove magic margins

From: Lukasz Luba
Date: Thu Sep 07 2023 - 11:53:55 EST

On 9/7/23 15:29, Peter Zijlstra wrote:
> On Thu, Sep 07, 2023 at 02:57:26PM +0100, Lukasz Luba wrote:
>>
>> On 9/7/23 14:26, Peter Zijlstra wrote:
>>> On Wed, Sep 06, 2023 at 10:18:50PM +0100, Qais Yousef wrote:

>>>> This is probably a controversial statement, but I am not in favour
>>>> of util_est. I need to collect the data, but I think we're better
>>>> off with a 16ms PELT HALFLIFE as the default instead. But I will
>>>> need to do a separate investigation on that.

>>> I think util_est makes perfect sense: where PELT has to fundamentally
>>> decay non-running / non-runnable tasks in order to provide a temporal
>>> average, DVFS might be best served with a temporal max filter.
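
As a rough, self-contained illustration of that difference (not kernel
code; the decay weight and window length below are made up), a PELT-like
temporal average keeps shrinking while the task sleeps, whereas a
temporal max filter keeps reporting the recent peak:

#include <stdio.h>

#define WINDOW	8	/* arbitrary number of remembered samples */

static unsigned long decay_avg(unsigned long avg, unsigned long sample)
{
	/* new = 3/4 * old + 1/4 * sample: keeps shrinking while idle */
	return (3 * avg + sample) / 4;
}

static unsigned long max_filter(const unsigned long *hist, int n)
{
	unsigned long max = 0;

	/* DVFS would see the largest recent sample, not the decayed average */
	for (int i = 0; i < n; i++)
		if (hist[i] > max)
			max = hist[i];
	return max;
}

int main(void)
{
	unsigned long hist[WINDOW] = { 900 };	/* one busy period, then idle */
	unsigned long avg = 900;

	for (int t = 0; t < WINDOW; t++) {
		avg = decay_avg(avg, 0);	/* task is sleeping */
		printf("t=%d avg=%lu max=%lu\n", t, avg, max_filter(hist, WINDOW));
	}
	return 0;
}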



>> Since we are here...
>> Would you allow a configuration option for the util_est shifter,
>> UTIL_EST_WEIGHT_SHIFT?
>>
>> I've found values other than '2' to work better in some scenarios.
>> That helps to prevent a big task from 'down' migrating from a Big CPU
>> (1024) to some Mid CPU (~500-700 capacity) or even a Little (~120-300).

> Larger values, I'm thinking, are what you're after? Those would cause
> the new contribution to weigh less, making the function smoother, right?

Yes, smoother, because we only use the 'ewma' smoothing for the decaying
part (not the rising part [1]).
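
To make the effect of the shift concrete, here is a minimal sketch of
the weighting it controls (shift == 2 matches the current
UTIL_EST_WEIGHT_SHIFT; other values are hypothetical). As in [1], rising
utilization is tracked directly and only the decaying side is smoothed,
with weight w = 1/2^shift, so a larger shift makes each new, lower
sample count less:

/*
 * Minimal sketch, not the kernel implementation:
 *
 *	ewma(t) = w * sample + (1 - w) * ewma(t-1),  w = 1 / 2^shift
 */
static unsigned long ewma_update(unsigned long ewma, unsigned long sample,
				 unsigned int shift)
{
	long diff = (long)sample - (long)ewma;

	if (diff >= 0)
		return sample;		/* rising: track utilization directly */

	ewma <<= shift;			/* decaying: add diff with weight 1/2^shift */
	ewma += diff;
	ewma >>= shift;
	return ewma;
}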


> What task characteristic is tied to this? That is, this seems trivial
> to modify per-task.

In particular the Speedometer test and its main browser task, which
reaches ~900 util but sometimes vanishes and waits for other background
tasks to do something. In the meantime it can decay and wake up on a
Mid/Little CPU (which can cost up to 5-10% of the score vs. pinning the
task to the big CPUs). So a longer-lived util_est helps to avoid at
least the very bad down migrations to the Littles...
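
As a rough back-of-the-envelope illustration of that scenario, assuming
one ewma update per sleep with a near-zero sample and reusing the
hypothetical ewma_update() sketch from above: a larger shift gives the
~900-util task more missed activations before its estimate drops below
a ~700 Mid capacity.

#include <stdio.h>

/* same hypothetical helper as sketched above */
static unsigned long ewma_update(unsigned long ewma, unsigned long sample,
				 unsigned int shift)
{
	long diff = (long)sample - (long)ewma;

	if (diff >= 0)
		return sample;

	ewma <<= shift;
	ewma += diff;
	ewma >>= shift;
	return ewma;
}

int main(void)
{
	for (unsigned int shift = 2; shift <= 4; shift++) {
		unsigned long ewma = 900;	/* main browser task estimate */
		int cycles = 0;

		/* nearly idle activations until we would fit a ~700 Mid CPU */
		while (ewma > 700) {
			ewma = ewma_update(ewma, 0, shift);
			cycles++;
		}
		printf("shift=%u: below 700 after %d cycles (ewma=%lu)\n",
		       shift, cycles, ewma);
	}
	return 0;
}

With shift == 2 a single near-idle period already pulls the estimate
below Mid capacity (900 -> 675), which is roughly the down-migration
pattern described above.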

[1] https://elixir.bootlin.com/linux/v6.5.1/source/kernel/sched/fair.c#L4442