[PATCH v3] sched/cfs: make util/load_avg more stable
From: Vincent Guittot
Date: Wed Apr 26 2017 - 02:28:12 EST
In the current implementation of load/util_avg, we assume that the ongoing
time segment has fully elapsed, and util/load_sum is divided by LOAD_AVG_MAX,
even if part of the time segment still remains to run. As a consequence, this
remaining part is treated as idle time and generates unexpected variations of
the util_avg of a busy CPU in the range [1002..1024[ whereas util_avg should
stay at 1023.
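To make the wobble concrete, here is a minimal standalone sketch (userspace C,
not kernel code; LOAD_AVG_MAX = 47742 and the 1024us segment length are the
values used by kernel/sched/fair.c in this series):

#include <stdio.h>

#define LOAD_AVG_MAX 47742ULL

int main(void)
{
	/*
	 * util_sum of an always-busy CPU right after the decay at a
	 * segment boundary: the fully elapsed segments sum to
	 * LOAD_AVG_MAX*y == LOAD_AVG_MAX - 1024, each us weighted by 1024.
	 */
	unsigned long long util_sum = 1024ULL * (LOAD_AVG_MAX - 1024);

	/*
	 * The current code divides by the full range although the ongoing
	 * segment has only just begun: the result dips to 1002.
	 */
	printf("start of segment: %llu\n", util_sum / LOAD_AVG_MAX);

	/*
	 * At the very end of the segment, its 1024us have been
	 * accumulated and the ratio is back at the top of the range.
	 */
	util_sum += 1024ULL * 1024;
	printf("end of segment:   %llu\n", util_sum / LOAD_AVG_MAX);

	return 0;
}

This prints 1002 and 1024: every segment boundary makes the util_avg of a
fully busy CPU dip and climb back, which is the variation described above
(the kernel settles at 1023 rather than 1024 because the decay math
truncates).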
In order to keep the metric stable, we should not consider the ongoing time
segment when computing load/util_avg but only the segments that have already
fully elapsed. But ignoring the current time segment adds unwanted latency
to the load/util_avg responsiveness, especially when the time is scaled
instead of the contribution. Instead of waiting for the current time segment
to have fully elapsed before accounting it in load/util_avg, we can already
account the elapsed part but change the range used to compute load/util_avg
accordingly.
At the very beginning of a new time segment, the past segments have been
decayed and the max value is LOAD_AVG_MAX*y. At the very end of the current
time segment, the max value becomes 1024 (us) + LOAD_AVG_MAX*y, which is equal
to LOAD_AVG_MAX. More generally, the max value is
sa->period_contrib + LOAD_AVG_MAX*y at any point within the time segment.
Taking advantage of the fact that LOAD_AVG_MAX*y == LOAD_AVG_MAX-1024, the
range becomes [0..LOAD_AVG_MAX-1024+sa->period_contrib].
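The effect of the new range can be checked with the same kind of standalone
sketch (again not kernel code, same assumed constants): dividing by
LOAD_AVG_MAX - 1024 + period_contrib keeps the ratio constant at any point
in the segment.

#include <stdio.h>

#define LOAD_AVG_MAX 47742ULL

int main(void)
{
	/*
	 * Decayed sum of all fully elapsed segments: LOAD_AVG_MAX*y,
	 * i.e. LOAD_AVG_MAX - 1024, each us weighted by 1024.
	 */
	unsigned long long decayed = 1024ULL * (LOAD_AVG_MAX - 1024);
	unsigned int contrib;

	for (contrib = 0; contrib <= 1024; contrib += 256) {
		/* the elapsed part of the segment is already in the sum */
		unsigned long long util_sum = decayed + 1024ULL * contrib;
		/* the corrected range used by this patch */
		unsigned long long range = LOAD_AVG_MAX - 1024 + contrib;

		printf("period_contrib=%4u -> util_avg=%llu\n",
		       contrib, util_sum / range);
	}
	return 0;
}

Every iteration prints 1024 in this idealized model (1023 in the kernel,
again because the decay truncates), instead of a value climbing from 1002
to 1024 within each segment.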
As the elapsed part is already accounted for in load/util_sum, we update the max
value according to the current position in the time segment instead of
removing its contribution.
Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
---
Changes:
- Correct typo in commit message: s/MAX_LOAD_AVG/LOAD_AVG_MAX/ and fix the
square bracket notation
kernel/sched/fair.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a903276..3531fa1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2916,12 +2916,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	/*
 	 * Step 2: update *_avg.
 	 */
-	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	if (cfs_rq) {
 		cfs_rq->runnable_load_avg =
-			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
+			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	}
-	sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
+	sa->util_avg = sa->util_sum / (LOAD_AVG_MAX - 1024 + sa->period_contrib);
 
 	return 1;
 }
--
2.7.4