Re: [RFC PATCH 08/16] sched/fair: Extend util_est to improve rampup time
From: Dietmar Eggemann
Date: Tue Sep 17 2024 - 15:21:29 EST
On 20/08/2024 18:35, Qais Yousef wrote:
> Utilization invariance can cause big delays. When tasks are running,
> accumulate a non-invariant version of utilization to help tasks settle
> down to their new util_avg values faster.
>
> Keep track of delta_exec during runnable across activations to help
> update util_est for a long running task accurately. util_est should
> still behave the same at enqueue/dequeue.
>
> Before this patch a busy task ramping up would experience the
> following transitions, running on an M1 Mac Mini
[...]
> @@ -4890,16 +4890,20 @@ static inline void util_est_update(struct cfs_rq *cfs_rq,
> if (!sched_feat(UTIL_EST))
> return;
>
> - /*
> - * Skip update of task's estimated utilization when the task has not
> - * yet completed an activation, e.g. being migrated.
> - */
> - if (!task_sleep)
> - return;
> -
> /* Get current estimate of utilization */
> ewma = READ_ONCE(p->se.avg.util_est);
>
> + /*
> + * If a task is running, update util_est ignoring utilization
> + * invariance so that if the task suddenly becomes busy we will rampup
> + * quickly to settle down to our new util_avg.
> + */
> + if (!task_sleep) {
> + ewma &= ~UTIL_AVG_UNCHANGED;
> + ewma = approximate_util_avg(ewma, p->se.delta_exec / 1000);
> + goto done;
> + }
> +
Can you not use the UTIL_EST_FASTER idea for that? I mean speed up
ramp-up on little CPUs for truly ramp-up tasks to fight the influence of
invariant util_avg->util_est here.
https://lkml.kernel.org/r/Y2kLA8x40IiBEPYg@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
I do understand that runnable_avg boosting won't help here since we're
not fighting contention.
UTIL_EST_FASTER uses the sum of all activations since wake-up, so it
should be faster than just using the last activation.
It also relies on existing infrastructure: __accumulate_pelt_segments().
If you use it inside task- and/or cpu-util function, you don't need to
make util_est state handling more complicated (distinguish periodic and
ramp-up task, including PATCH 09/16).
From your workload analysis, do you have examples of Android tasks which
are clearly ramp-up tasks and maybe also affine to the little CPUs
(thanks to Android BACKGROUND group) which would require this correction
of the invariant util_avg->util_est signals?
[...]