Re: [PATCH v6 1/4] sched/fair: Fix attaching task sched avgs twice when switching to fair or changing task group

From: Yuyang Du
Date: Fri Jun 17 2016 - 06:09:42 EST


On Thu, Jun 16, 2016 at 11:21:55PM +0200, Vincent Guittot wrote:
> On 16 June 2016 at 22:07, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Thu, Jun 16, 2016 at 09:00:57PM +0200, Vincent Guittot wrote:
> >> On 16 June 2016 at 20:51, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >> > On Thu, Jun 16, 2016 at 06:30:13PM +0200, Vincent Guittot wrote:
> >> >> With patch [1] for the init of cfs_rq side, all use cases will be
> >> >> covered regarding the issue linked to a last_update_time set to 0 at
> >> >> init
> >> >> [1] https://lkml.org/lkml/2016/5/30/508
> >> >
> >> > Aah, wait, now I get it :-)
> >> >
> >> > Still, we should put cfs_rq_clock_task(cfs_rq) in it, not 1. And since
> >> > we now acquire rq->lock on init this should well be possible. Lemme sort
> >> > that.
> >>
> >> yes with the rq->lock we can use cfs_rq_clock_task which makes more
> >> sense than 1.
> >> But the delta can still be significant between the creation of the
> >> task group and the 1st task that will be attached to the cfs_rq
> >
> > Ah, I think I've spotted more fail.
> >
> > And I think you're right, it doesn't matter, in fact, 0 should have been
> > fine too!
> >
> > enqueue_entity()
> >   enqueue_entity_load_avg()
> >     update_cfs_rq_load_avg()
> >       now = clock()
> >       __update_load_avg(&cfs_rq->avg)
> >         cfs_rq->avg.last_load_update = now
> >         // ages 0 load/util for: now - 0
> >     if (migrated)
> >       attach_entity_load_avg()
> >         se->avg.last_load_update = cfs_rq->avg.last_load_update; // now != 0
> >
> > So I don't see how it can end up being attached again.
>
> In fact it has already been attached during the sched_move_task. The
> sequence for the 1st task that is attached to a cfs_rq is :
>
> sched_move_task()
>   task_move_group_fair()
>     detach_task_cfs_rq()
>     set_task_rq()
>     attach_task_cfs_rq()
>       attach_entity_load_avg()
>         se->avg.last_load_update = cfs_rq->avg.last_load_update == 0
>

Then again, does this fix it?

static void task_move_group_fair(struct task_struct *p)
{
	detach_task_cfs_rq(p);
	set_task_rq(p, task_cpu(p));
	attach_task_cfs_rq(p);
	/*
	 * If the cfs_rq's last_update_time is 0, attaching the sched avgs
	 * won't be anything useful, as it will be decayed to 0 when any
	 * sched_entity is enqueued to that cfs_rq.
	 *
	 * On the other hand, if the cfs_rq's last_update_time is 0, we
	 * must reset the task's last_update_time to ensure we will attach
	 * the sched avgs when the task is enqueued.
	 */
	if (!cfs_rq_of(&p->se)->avg.last_update_time)
		reset_task_last_update_time(p);
	else
		attach_entity_load_avg(cfs_rq_of(&p->se), &p->se);
}
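
To make the timestamp argument concrete, here is a rough userspace sketch of the two sequences quoted above. It is a toy model, not kernel code: every toy_* name is hypothetical, decay of the averages is deliberately not modeled, and only the number of attaches is counted. toy_move_group_deferred() follows my reading of the comment in the proposal (skip the attach when the cfs_rq's last_update_time is 0 and leave it to enqueue), which is an assumption about the intent rather than a copy of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for the scheduler structures; hypothetical, not kernel code. */
struct toy_cfs_rq {
	uint64_t last_update_time;	/* 0 until the first update */
};

struct toy_se {
	uint64_t last_update_time;	/* 0 means "attach me on enqueue" */
	int attach_count;		/* how many times the averages were attached */
};

/* Attach: sync the entity's timestamp to the cfs_rq and count the attach. */
static void toy_attach(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	se->last_update_time = cfs_rq->last_update_time;
	se->attach_count++;
}

/* Group move as in the quoted trace: always attach. */
static void toy_move_group(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	toy_attach(cfs_rq, se);
}

/* Group move per the comment in the proposal: defer if the cfs_rq timestamp is 0. */
static void toy_move_group_deferred(struct toy_cfs_rq *cfs_rq, struct toy_se *se)
{
	if (!cfs_rq->last_update_time)
		se->last_update_time = 0;	/* let enqueue do the attach */
	else
		toy_attach(cfs_rq, se);
}

/* Enqueue: advance the cfs_rq clock, then attach an entity whose timestamp is 0. */
static void toy_enqueue(struct toy_cfs_rq *cfs_rq, struct toy_se *se, uint64_t now)
{
	int migrated = !se->last_update_time;

	assert(now != 0);
	cfs_rq->last_update_time = now;
	if (migrated)
		toy_attach(cfs_rq, se);
}

/* Run move + enqueue against a cfs_rq and return the number of attaches. */
static int toy_count_attaches(uint64_t rq_time, int deferred)
{
	struct toy_cfs_rq cfs_rq = { .last_update_time = rq_time };
	struct toy_se se = { .last_update_time = 1, .attach_count = 0 };

	if (deferred)
		toy_move_group_deferred(&cfs_rq, &se);
	else
		toy_move_group(&cfs_rq, &se);
	toy_enqueue(&cfs_rq, &se, rq_time + 100);
	return se.attach_count;
}
```

In this model, toy_count_attaches(0, 0) counts two attaches for a task moved to a cfs_rq that has never been updated, while the deferred variant, or any cfs_rq with a non-zero timestamp, counts one.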