[PATCH 5/5] sched: use the old min_vruntime when normalizing on dequeue

From: John Stultz
Date: Fri Nov 19 2010 - 21:10:30 EST


From: Dima Zavin <dima@xxxxxxxxxxx>

After pulling the thread off the run-queue during a cgroup change,
the cfs_rq.min_vruntime gets recalculated. The dequeued thread's vruntime
then gets normalized to this new value. This can then lead to the thread
getting an unfair boost in the new group if the vruntime of the next
task in the old run-queue was way further ahead.

CC: Ingo Molnar <mingo@xxxxxxx>
CC: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Arve HjÃnnevÃg <arve@xxxxxxxxxxx>
Signed-off-by: Dima Zavin <dima@xxxxxxxxxxx>
Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
---
kernel/sched_fair.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index f4f6a83..72f19ad 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -802,6 +802,8 @@ static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
static void
dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
{
+ u64 min_vruntime;
+
/*
* Update run-time statistics of the 'current'.
*/
@@ -826,6 +828,8 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
if (se != cfs_rq->curr)
__dequeue_entity(cfs_rq, se);
account_entity_dequeue(cfs_rq, se);
+
+ min_vruntime = cfs_rq->min_vruntime;
update_min_vruntime(cfs_rq);

/*
@@ -834,7 +838,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
* movement in our normalized position.
*/
if (!(flags & DEQUEUE_SLEEP))
- se->vruntime -= cfs_rq->min_vruntime;
+ se->vruntime -= min_vruntime;
}

/*
--
1.7.3.2.146.gca209

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/