[PATCH v3 3/5] sched: sync a se with prev cfs_rq when changing cgroup

From: byungchul . park
Date: Wed Aug 19 2015 - 02:48:23 EST


From: Byungchul Park <byungchul.park@xxxxxxx>

current code is wrong with cfs_rq's average load when changing a task's
cfs_rq to another. i tested with "echo pid > cgroup" and found that
e.g. cfs_rq->avg.load_avg became larger and larger whenever changing
cgroup to another again and again. we have to sync se's average load
with both *prev* cfs_rq and next cfs_rq when changing its cgroup.

Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx>
---
kernel/sched/fair.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 191d9be..1be042a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8040,8 +8040,14 @@ static void task_move_group_fair(struct task_struct *p, int queued)
if (!queued && (!se->sum_exec_runtime || p->state == TASK_WAKING))
queued = 1;

+ cfs_rq = cfs_rq_of(se);
if (!queued)
- se->vruntime -= cfs_rq_of(se)->min_vruntime;
+ se->vruntime -= cfs_rq->min_vruntime;
+
+#ifdef CONFIG_SMP
+ /* synchronize task with its prev cfs_rq */
+ detach_entity_load_avg(cfs_rq, se);
+#endif
set_task_rq(p, task_cpu(p));
se->depth = se->parent ? se->parent->depth + 1 : 0;
cfs_rq = cfs_rq_of(se);
--
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/