Re: broken behavior in cfs when moving threads between cgroups

From: Mike Galbraith
Date: Tue Sep 28 2010 - 04:17:47 EST


On Mon, 2010-09-27 at 23:07 -0700, Dima Zavin wrote:

> I'm not really sure how to fix #1 cleanly, since we don't have enough
> information (i.e. we don't know the previous group) inside
> moved_group_fair() to adjust the value. It has to be adjusted in
> sched_move_task(), but I don't see the right interface to do that. I
> included a hacky patch further down that accesses the sched_fair internals
> directly, which I know is wrong but illustrates what I think needs to be done.

I don't really see the relevance of an entity's lag in it's previous
group, so would tend toward entry at parity.

Maybe the move _should_ be treated as a fork, for the same reason we do
START_DEBIT, but if so, seems to me it's irrelevant whether the task is
currently sleeping or not, it's going to run for the first time in it's
new home just as any runnable task, and may do so very soon. OTOH,
frequent moves with that hefty START_DEBIT price tag could hurt very
badly, and makes caring about previous lag seem pointless.

-Mike

(untested)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 9b5b4f8..4dc6b2f 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -3827,10 +3827,14 @@ static void set_curr_task_fair(struct rq *rq)
static void moved_group_fair(struct task_struct *p, int on_rq)
{
struct cfs_rq *cfs_rq = task_cfs_rq(p);
+ struct sched_entity *se = &p->se;
+ u64 vruntime = 0;

update_curr(cfs_rq);
if (!on_rq)
- place_entity(cfs_rq, &p->se, 1);
+ vruntime = cfs_rq->min_vruntime;
+
+ se->vruntime = vruntime;
}
#endif




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/