Re: [PATCH] sched/fair: Sync se's load_avg with cfs_rq in reweight_entity

From: Chengming Zhou
Date: Wed Jul 17 2024 - 07:19:03 EST


On 2024/7/16 23:08, Chuyi Zhou wrote:
In reweight_entity(), if !se->on_rq (e.g. when we are reweighting a
sleeping task), we should sync the load_avg of se to cfs_rq before calling
dequeue_load_avg(). Otherwise, the load_avg of this se can be inaccurate.

Good catch!


Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
kernel/sched/fair.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9057584ec06d..2807f6e72ad1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3782,6 +3782,8 @@ static void reweight_eevdf(struct sched_entity *se, u64 avruntime,
se->deadline = avruntime + vslice;
}
+static inline void update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags);
+
static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
unsigned long weight)
{
@@ -3795,7 +3797,11 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
if (!curr)
__dequeue_entity(cfs_rq, se);
update_load_sub(&cfs_rq->load, se->load.weight);
+ } else {
+ /* Sync with the cfs_rq before removing our load_avg */
+ update_load_avg(cfs_rq, se, 0);

I think it's suboptimal to call update_load_avg() here unconditionally.

That's because reweight_entity() has two kinds of callers:

1. group se, which is reweighted from update_cfs_group(). That path
should already have done update_load_avg(), so there should be no
problem.

2. task se, which is reweighted from reweight_task(). This is the case
that should be fixed for a sleeping task entity, as you described above.

So IMHO, we should only call update_load_avg() or sync_entity_load_avg()
in reweight_task(), right?
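
To illustrate why the stale load_avg matters, here is a toy userspace
model. It is not kernel code: every toy_* name is made up for this
sketch, and the decay is simplified to halving once per period, whereas
real PELT decays geometrically. It only shows that subtracting an
un-synced se average from an already-decayed cfs_rq average removes too
much.

```c
#include <stdio.h>

/* Toy stand-in for a load average; hypothetical, not struct sched_avg. */
struct toy_avg {
	unsigned long load_avg;
	unsigned long last_update;	/* in whole "periods" */
};

/* Simplified decay: halve once per elapsed period up to 'now'. */
static void toy_decay(struct toy_avg *a, unsigned long now)
{
	while (a->last_update < now) {
		a->load_avg /= 2;
		a->last_update++;
	}
}

/* Subtract without underflowing, clamping at zero. */
static unsigned long toy_sub_positive(unsigned long v, unsigned long d)
{
	return v > d ? v - d : 0;
}

/*
 * Remove se's contribution from rq at time 'now' and return what is
 * left on rq. rq is always brought up to date; se is brought up to
 * date only when 'sync' is non-zero. Both are passed by value, so the
 * caller's copies are not modified.
 */
static unsigned long toy_dequeue(struct toy_avg rq, struct toy_avg se,
				 unsigned long now, int sync)
{
	toy_decay(&rq, now);
	if (sync)
		toy_decay(&se, now);
	return toy_sub_positive(rq.load_avg, se.load_avg);
}
```

With rq starting at 2048 (two sleeping entities of 1024 each) and se at
1024, both last updated at period 0, removing se at period 2 without a
sync leaves 0 on rq, so the other entity's decayed share is lost too.
With a sync it leaves 256, which is exactly the other entity's share.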

}
+
dequeue_load_avg(cfs_rq, se);
if (se->on_rq) {