Re: [PATCH 2/3] sched/fair: Correctly insert cfs_rq's to list on unthrottle

From: Odin Ugedal
Date: Fri May 28 2021 - 11:07:22 EST


Hi,

> What would be the other condition in addition to the current one
> :cfs_rq->nr_running >= 1 ?

The condition is that if the cfs_rq still has load, we should add it
(I am not 100% sure about util_avg and runnable_avg, though). Using
"!cfs_rq_is_decayed()" is another way, but imo that is a bit
overkill.

> We need to add a cfs_rq in the list if it still contributes to the
> tg->load_avg and the split of the share. Can't we add a condition for
> this instead of adding a new field ?

Yes, using cfs_rq->tg_load_avg_contrib as below would also work the
same way. I still think that explicitly re-inserting the cfs_rq when
we have removed it is cleaner, as that keeps it consistent with the
other uses of list_add_leaf_cfs_rq() and list_del_leaf_cfs_rq(), but
that is a matter of preference, I guess. I do however think that using
tg_load_avg_contrib will work just fine, as it should always be
positive when the cfs_rq has some load. I can resend v2 of this
patch using this instead:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ad7556f99b4a..969ae7f930f5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4720,7 +4720,7 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
 					     cfs_rq->throttled_clock_task;

 		/* Add cfs_rq with already running entity in the list */
-		if (cfs_rq->nr_running >= 1)
+		if (cfs_rq->tg_load_avg_contrib)
 			list_add_leaf_cfs_rq(cfs_rq);
 	}
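
To illustrate where the two conditions differ, here is a minimal
userspace sketch. It is not kernel code: struct toy_cfs_rq and the two
helpers are hypothetical stand-ins that only mirror the two fields
discussed above, and it does not model how PELT updates them. The case
of interest is a cfs_rq with no runnable entities that still
contributes to tg->load_avg.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two struct cfs_rq fields discussed here. */
struct toy_cfs_rq {
	unsigned int nr_running;           /* runnable entities queued right now */
	unsigned long tg_load_avg_contrib; /* load still counted in tg->load_avg */
};

/* Condition currently used in tg_unthrottle_up(). */
static bool should_add_old(const struct toy_cfs_rq *cfs_rq)
{
	return cfs_rq->nr_running >= 1;
}

/* Condition proposed in the diff above. */
static bool should_add_new(const struct toy_cfs_rq *cfs_rq)
{
	return cfs_rq->tg_load_avg_contrib != 0;
}
```

With nr_running == 0 and a non-zero tg_load_avg_contrib, only
should_add_new() returns true, so only the proposed check would put
such a cfs_rq back on the leaf list.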