[PATCH 2/3] sched: Unloop sched avg decaying
From: Frederic Weisbecker
Date: Tue Jun 14 2016 - 11:28:21 EST
The code that computes sched rt avg decaying periodically (every 0.5
seconds) catches up with lost updates within a loop. This is ok when
CPUs sleep in dynticks mode for short periods of time as they only
miss a few updates that are caught up through a few iterations. But CPUs
can sleep for undefined periods of time, leading to unbound number of
loop iterations to catch up. As a result, the computation time of
sched_avg_update() can virtually induce some latency issue.
The solution in this patch propose to convert this loop into a single
division. Testing on idle load has shown positive results. Below
is showned a function profile comparison between the upstream version
(renamed as sched_avg_update_old) against the new version
(sched_avg_update) with both being called alternatively.
New version can be up to 2 times faster on average.
Function Hit Time Avg s^2
-------- --- ---- --- ---
trace_stat/function0: sched_avg_update_old 135120 47612.36 us 0.352 us 1049.606 us
trace_stat/function0: sched_avg_update 135120 42668.44 us 0.315 us 1617.970 us
trace_stat/function1: sched_avg_update_old 132074 46618.41 us 0.352 us 3653.566 us
trace_stat/function1: sched_avg_update 132074 41356.76 us 0.313 us 3161.585 us
trace_stat/function2: sched_avg_update_old 121335 43381.10 us 0.357 us 1262.240 us
trace_stat/function2: sched_avg_update 121335 38240.72 us 0.315 us 970.400 us
trace_stat/function3: sched_avg_update_old 37563 17640.67 us 0.469 us 148.206 us
trace_stat/function3: sched_avg_update 37563 11059.65 us 0.294 us 126.548 us
trace_stat/function4: sched_avg_update_old 24200 13557.05 us 0.560 us 57.124 us
trace_stat/function4: sched_avg_update 24200 6687.281 us 0.276 us 9.528 us
trace_stat/function5: sched_avg_update_old 28572 15158.76 us 0.530 us 674.049 us
trace_stat/function5: sched_avg_update 28573 8012.361 us 0.280 us 181.687 us
trace_stat/function6: sched_avg_update_old 23424 12987.64 us 0.554 us 27.639 us
trace_stat/function6: sched_avg_update 23425 6264.433 us 0.267 us 2.965 us
trace_stat/function7: sched_avg_update_old 22192 13083.69 us 0.589 us 25.605 us
trace_stat/function7: sched_avg_update 22191 5947.785 us 0.268 us 1.748 us
Below is a snapshot of the same function profile under full buzy load
(perf bench sched messaging). Old and new versions there show roughly
the same results with tiny 2 or 3 nsecs in favour of the old version
which shouldn't matter much as it is called every 0.5 seconds.
Function Hit Time Avg s^2
-------- --- ---- --- ---
trace_stat/function0: sched_avg_update 106699 8029.961 us 0.075 us 0.244 us
trace_stat/function0: sched_avg_update_old 106698 7852.948 us 0.073 us 0.313 us
trace_stat/function1: sched_avg_update 106547 8066.833 us 0.075 us 1.256 us
trace_stat/function1: sched_avg_update_old 106547 7794.896 us 0.073 us 1.521 us
trace_stat/function2: sched_avg_update 106527 8049.326 us 0.075 us 1.141 us
trace_stat/function2: sched_avg_update_old 106528 7818.155 us 0.073 us 1.052 us
trace_stat/function3: sched_avg_update 106534 8056.079 us 0.075 us 0.342 us
trace_stat/function3: sched_avg_update_old 106535 7815.416 us 0.073 us 0.369 us
trace_stat/function4: sched_avg_update 106433 8090.462 us 0.076 us 5.359 us
trace_stat/function4: sched_avg_update_old 106433 7879.694 us 0.074 us 0.433 us
trace_stat/function5: sched_avg_update 106426 8127.800 us 0.076 us 1.304 us
trace_stat/function5: sched_avg_update_old 106425 7854.538 us 0.073 us 2.466 us
trace_stat/function6: sched_avg_update 106436 8067.921 us 0.075 us 0.257 us
trace_stat/function6: sched_avg_update_old 106436 7830.492 us 0.073 us 0.334 us
trace_stat/function7: sched_avg_update 106427 8135.786 us 0.076 us 13.181 us
trace_stat/function7: sched_avg_update_old 106428 7940.925 us 0.074 us 0.982 us
As a conclusion, the new version behaves roughly the same under buzy
load but has a faster and much more contained behaviour under idle load.
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
---
kernel/sched/core.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 385c947..0c0578a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -666,17 +666,17 @@ bool sched_can_stop_tick(struct rq *rq)
void sched_avg_update(struct rq *rq)
{
s64 period = sched_avg_period();
+ s64 delta;
+ u64 rem;
+ int pending;
- while ((s64)(rq_clock(rq) - rq->age_stamp) > period) {
- /*
- * Inline assembly required to prevent the compiler
- * optimising this loop into a divmod call.
- * See __iter_div_u64_rem() for another example of this.
- */
- asm("" : "+rm" (rq->age_stamp));
- rq->age_stamp += period;
- rq->rt_avg /= 2;
- }
+ delta = (s64)(rq_clock(rq) - rq->age_stamp);
+ if (delta <= period)
+ return;
+
+ pending = div64_u64_rem(delta, period, &rem);
+ rq->age_stamp += delta - rem;
+ rq->rt_avg >>= pending;
}
#endif /* CONFIG_SMP */
--
2.7.0