Re: [RFC PATCH 0/7] sched: cpufreq: Remove magic margins

From: Dietmar Eggemann
Date: Tue Sep 12 2023 - 13:18:37 EST


On 08/09/2023 16:07, Qais Yousef wrote:
> On 09/08/23 09:40, Dietmar Eggemann wrote:
>> On 08/09/2023 02:17, Qais Yousef wrote:
>>> On 09/07/23 15:08, Peter Zijlstra wrote:
>>>> On Mon, Aug 28, 2023 at 12:31:56AM +0100, Qais Yousef wrote:

[...]

>>> And what was a high end A78 is a mid core today. So if you look at today's
>>> mobile world topology we really have a tiny+big+huge combination of cores. The
>>> bigs are called mids, but they're very capable. Fits capacity forces migration
>>> to the 'huge' cores too soon with that 80% margin. While the 80% might be too
>>> small for the tiny ones as some workloads really struggle there if they hang on
>>> for too long. It doesn't help that these systems ship with 4ms tick. Something
>>> more to consider changing I guess.
>>
>> If this is the problem then you could simply make the margin (headroom)
>> a function of cpu_capacity_orig?
>
> I don't see what you mean. instead of capacity_of() but keep the 80%?
>
> Again, I could be delusional and misunderstanding everything, but what I really
> see fits_capacity() is about is misfit detection. But a task is not really
> misfit until it actually has a util above the capacity of the CPU. Now due to
> implementation details there can be a delay between the task crossing this
> capacity and being able to move it. Which is what I believe this headroom is
> trying to achieve.
>
> I think we can better define this by tying this headroom to the worst case
> scenario it takes to actually move this misfit task to the right CPU. If it can
> continue to run without being impacted with this delay and crossing the
> capacity of the CPU it is on, then we should not trigger misfit IMO.


Instead of:

  fits_capacity(unsigned long util, unsigned long capacity)
  {
          return approximate_util_avg(util, TICK_USEC) < capacity;
  }

just make the 1280 in:

  #define fits_capacity(cap, max) ((cap) * 1280 < (max) * 1024)

depend on the CPU's capacity_orig, or on the capacity difference to the
next higher capacity_orig.

Typical example today: {little, medium, big} capacity_orig = {128, 896, 1024}

  896/128  = 7
  1024/896 = 1.14

This would give a higher margin on little and a lower margin on medium.
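
A rough userspace sketch of the idea above, only to make the arithmetic concrete. The helper names margin_for() and fits_capacity_margin(), the MARGIN_SPAN constant and the linear mapping from capacity gap to margin are all made up for illustration; none of this is from the patchset or from fair.c, and the right mapping would need tuning.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE	1024UL

/* Hypothetical tunable: extra margin applied when the next CPU type is >= 2x. */
#define MARGIN_SPAN		512UL

/*
 * Derive the multiplier that replaces the fixed 1280 from the relative gap
 * to the next higher capacity_orig. The gap is (next - cur) / cur in
 * SCHED_CAPACITY_SCALE units, clamped to 1.0, so the result lies between
 * 1024 (no headroom) and 1024 + MARGIN_SPAN.
 */
static inline unsigned long margin_for(unsigned long cap_orig,
				       unsigned long next_cap_orig)
{
	unsigned long gap;

	assert(cap_orig > 0 && next_cap_orig >= cap_orig);

	gap = (next_cap_orig - cap_orig) * SCHED_CAPACITY_SCALE / cap_orig;
	if (gap > SCHED_CAPACITY_SCALE)
		gap = SCHED_CAPACITY_SCALE;

	return SCHED_CAPACITY_SCALE + MARGIN_SPAN * gap / SCHED_CAPACITY_SCALE;
}

/* Same shape as the fits_capacity() macro, with the margin passed in. */
static inline int fits_capacity_margin(unsigned long util, unsigned long cap,
				       unsigned long margin)
{
	return util * margin < cap * SCHED_CAPACITY_SCALE;
}
```

With {128, 896, 1024} this particular mapping gives 1536 on little (a task fits up to roughly 67% of capacity) and 1097 on medium (roughly 93%), versus 1280 (80%) everywhere today. The biggest CPU has no higher capacity_orig, so passing its own capacity as next_cap_orig yields 1024.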

[...]