Re: [PATCH 3/8] sched,fair: redefine runnable_load_avg as the sum of task_h_load

From: Dietmar Eggemann
Date: Wed Jun 26 2019 - 10:34:41 EST


On 6/12/19 9:32 PM, Rik van Riel wrote:
> The runnable_load magic is used to quickly propagate information about
> runnable tasks up the hierarchy of runqueues. When switching to a flat

It looks like some information is missing here.

> runqueue, that no longer works.
>
> Redefine the CPU cfs_rq runnable_load_avg to be the sum of task_h_loads
> of the runnable tasks. This provides enough information to the load
> balancer.
>
> The runnable_load_avg of the cgroup cfs_rqs does not appear to be
> used for anything, so don't bother calculating those.
>
> This removes one of the things that the code currently traverses the
> cgroup hierarchy for, and getting rid of it brings us one step closer
> to a flat runqueue for the CPU controller.
>
> Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
> ---
> include/linux/sched.h | 3 +-
> kernel/sched/core.c | 2 -
> kernel/sched/debug.c | 1 +
> kernel/sched/fair.c | 125 +++++++++++++-----------------------------
> kernel/sched/pelt.c | 49 ++++++-----------
> kernel/sched/sched.h | 6 --
> 6 files changed, 55 insertions(+), 131 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 11837410690f..f5bb6948e40c 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -391,7 +391,6 @@ struct util_est {
> struct sched_avg {
> u64 last_update_time;
> u64 load_sum;
> - u64 runnable_load_sum;
> u32 util_sum;
> u32 period_contrib;
> unsigned long load_avg;

Could you not also remove runnable_load_avg from struct sched_avg and
put it into struct cfs_rq directly? The signal has nothing to do
with PELT anymore, so se's don't have to carry it. You only need it for
the root cfs_rq's, but keeping it in every cfs_rq is at least better than
still having it in all se's as well.
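
Something along these lines, as an untested user-space sketch rather than
a patch against fair.c. The struct layouts are heavily simplified, and the
is_task and h_load members are hypothetical stand-ins for entity_is_task()
and task_h_load(); only the placement of runnable_load_avg is the point:

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only, not kernel code: layouts are simplified. */
struct sched_avg {
	unsigned long load_avg;	/* PELT signal, still carried per entity */
	/* runnable_load_avg no longer lives here */
};

struct rq;

struct cfs_rq {
	struct sched_avg avg;
	unsigned long runnable_load_avg;	/* only used on the root cfs_rq */
	struct rq *rq;
};

struct rq {
	struct cfs_rq cfs;	/* root cfs_rq of this CPU */
};

struct sched_entity {
	struct sched_avg avg;
	int is_task;		/* hypothetical: models entity_is_task(se) */
	unsigned long h_load;	/* hypothetical: models task_h_load() */
};

/* Add a task's hierarchical load to the root cfs_rq; group se's are skipped. */
static void
enqueue_runnable_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	assert(cfs_rq->rq != NULL);

	if (se->is_task) {
		struct cfs_rq *root_cfs_rq = &cfs_rq->rq->cfs;

		root_cfs_rq->runnable_load_avg += se->h_load;
	}
}

/* Remove the same contribution again, clamping at zero. */
static void
dequeue_runnable_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	assert(cfs_rq->rq != NULL);

	if (se->is_task) {
		struct cfs_rq *root_cfs_rq = &cfs_rq->rq->cfs;

		if (root_cfs_rq->runnable_load_avg >= se->h_load)
			root_cfs_rq->runnable_load_avg -= se->h_load;
		else
			root_cfs_rq->runnable_load_avg = 0;
	}
}
```

With this layout struct sched_entity no longer pays for the field, and a
task enqueued on a group cfs_rq only ever touches the root cfs_rq's sum.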

[...]

> @@ -2767,20 +2765,39 @@ account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
> static inline void
> enqueue_runnable_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
> {
> - cfs_rq->runnable_weight += se->runnable_weight;
> + if (entity_is_task(se)) {
> + struct cfs_rq *cpu_cfs_rq = &cfs_rq->rq->cfs;

There are a couple of comments in fair.c referring to this cfs_rq as the
root cfs_rq, rather than the cpu cfs_rq. IMHO, it is easier to read if we
stick to one name (root_cfs_rq vs. cpu_cfs_rq).

[...]