[RFC][PATCH 12/14] sched/fair: Cure calc_cfs_shares() vs reweight_entity()

From: Peter Zijlstra
Date: Fri May 12 2017 - 13:23:21 EST


Vincent reported that when running in a cgroup, his root
cfs_rq->avg.load_avg dropped to 0 as soon as the task went idle.

This is because reweight_entity() will now immediately propagate the
weight change of the group entity to its cfs_rq, and as it happens,
our approximation (5) for calc_cfs_shares() results in 0 when the group
is idle.

Avoid this by using the correct (3) as a lower bound on (5). This way
the empty cgroup will slowly decay instead of instantly dropping to 0.

Reported-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/sched/fair.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
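
For illustration only, here is a minimal standalone sketch of the effect
described in the changelog. It is not the kernel code. The helper names
(shares_model, max_long), the parameter names and the sample numbers are
made up, and the tg_weight arithmetic is an assumption about how the rest
of calc_cfs_shares() consumes 'load', since the hunk does not show it.

```c
/* Hypothetical helper; stands in for the kernel's max() macro. */
static long max_long(long a, long b)
{
	return a > b ? a : b;
}

/*
 * Simplified user-space model of the share computation (assumed shape):
 *
 *   shares = tg_shares * load / (tg_load_avg - contrib + load)
 *
 * weight          - models scale_load_down(cfs_rq->load.weight)
 * load_avg        - models cfs_rq->avg.load_avg
 * contrib         - this cfs_rq's part of tg_load_avg (assumed)
 * use_lower_bound - 0: load = weight (old behaviour)
 *                   1: load = max(weight, load_avg) (this patch)
 */
static long shares_model(long tg_shares, long tg_load_avg, long contrib,
			 long weight, long load_avg, int use_lower_bound)
{
	long load = use_lower_bound ? max_long(weight, load_avg) : weight;
	long tg_weight = tg_load_avg - contrib + load;
	long shares = tg_shares * load;

	/* Guard against dividing by zero when the whole group is empty. */
	if (tg_weight)
		shares /= tg_weight;

	return shares;
}
```

With tg_shares = 1024, tg_load_avg = 1024, a contribution of 512 from this
cfs_rq, weight = 0 (idle) and a still-decaying load_avg of 512, the old
form gives 0 while the bounded form gives 512. As load_avg decays, the
result shrinks gradually rather than collapsing at once.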

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2703,11 +2703,10 @@ static long calc_cfs_shares(struct cfs_r
tg_shares = READ_ONCE(tg->shares);

/*
- * This really should be: cfs_rq->avg.load_avg, but instead we use
- * cfs_rq->load.weight, which is its upper bound. This helps ramp up
- * the shares for small weight interactive tasks.
+ * Because (5) drops to 0 when the cfs_rq is idle, we need to use (3)
+ * as a lower bound.
*/
- load = scale_load_down(cfs_rq->load.weight);
+ load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);

tg_weight = atomic_long_read(&tg->load_avg);