Re: [RFC PATCH v2 1/2] sched/fair: Introduce UTIL_FITS_CAPACITY feature (v2)

From: Dietmar Eggemann
Date: Tue Oct 24 2023 - 11:03:38 EST


On 24/10/2023 08:10, Chen Yu wrote:
> On 2023-10-23 at 11:04:49 -0400, Mathieu Desnoyers wrote:
>> On 2023-10-23 10:11, Dietmar Eggemann wrote:
>>> On 19/10/2023 18:05, Mathieu Desnoyers wrote:

[...]

>>> Or like find_energy_efficient_cpu() (feec(), used in
>>> Energy-Aware-Scheduling (EAS)) which uses cpu_util(cpu, p, cpu, 0) to get:
>>>
>>> max(util_avg(CPU + p), util_est(CPU + p))
>>
>> I've tried using cpu_util(), but unfortunately anything that considers
>> blocked/sleeping tasks in its utilization total does not work for my
>> use-case.
>>
>> From cpu_util():
>>
>> * CPU utilization is the sum of running time of runnable tasks plus the
>> * recent utilization of currently non-runnable tasks on that CPU.
>>
>
> I thought cpu_util() indicates the utilization decay sum of task that was once
> "running" on this CPU, but will not sum up the "util/load" of the blocked/sleeping
> task?

cpu_util() here refers to:

cpu_util(int cpu, struct task_struct *p, int dst_cpu, int boost)

which when called with (cpu, p, cpu, 0) and task_cpu(p) != cpu returns:

max(util_avg(CPU + p), util_est(CPU + p))

The term `CPU utilization` in cpu_util()'s header stands for
cfs_rq->avg.util_avg.

It does not sum up the utilization of blocked tasks but it can contain
it. These have to be blocked tasks and not tasks which were running in
the cfs_rq and then migrated away, since we subtract the utilization of
tasks which are migrating away from the cfs_rq (cfs_rq->removed.util_avg
in remove_entity_load_avg() and update_cfs_rq_load_avg()).

> accumulate_sum()
>     /* only the running task's util will be sum up */
>     if (running)
>         sa->util_sum += contrib << SCHED_CAPACITY_SHIFT;
>
> WRITE_ONCE(sa->util_avg, sa->util_sum / divider);

__update_load_avg_cfs_rq()

  ___update_load_sum(..., cfs_rq->curr != NULL
                          ^^^^^^^^^^^^^^^^^^^^
                          running
    accumulate_sum()

      if (periods)
        /* decay _sum */
        sa->util_sum = decay_load(sa->util_sum, ...)

      if (load)
        /* decay and accrue _sum */
        contrib = __accumulate_pelt_segments(...)

When crossing periods we decay the old _sum and when additionally load
!= 0 we decay and accrue the new _sum as well.