Re: [PATCH v6 1/4] sched/fair: Fix attaching task sched avgs twice when switching to fair or changing task group
From: Vincent Guittot
Date: Wed Jun 15 2016 - 03:47:19 EST
On 15 June 2016 at 00:21, Yuyang Du <yuyang.du@xxxxxxxxx> wrote:
> Vincent reported that the first task to a new task group's cfs_rq will
> be attached in attach_task_cfs_rq() and once more when it is enqueued
> (see https://lkml.org/lkml/2016/5/25/388).
>
> Actually, it is worse. The sched avgs can sometimes be attached twice
> not only when we change task groups but also when we switch to fair class.
> These two scenarios will be described in the following respectively.
>
> (1) Switch to fair class:
>
> The sched class change is done like this:
>
>     if (queued)
>             enqueue_task();
>     check_class_changed()
>             switched_from()
>             switched_to()
>
> If the task is on_rq, it has already been enqueued before switched_to().
> That enqueue attaches the sched avgs to the cfs_rq if the task's
> last_update_time is 0, which can happen if the task was never in the fair
> class. In that case we must not attach it again in switched_to();
> otherwise, we attach it twice.
>
> To address both the on_rq and !on_rq cases, whether or not the task
> was switched from fair, the simplest solution is to reset the task's
> last_update_time to 0 when the task is switched from fair, and then
> let task enqueue do the sched avgs attachment uniformly, only once.
>
> (2) Change between fair task groups:
>
> The task groups are changed like this:
>
>     if (queued)
>             dequeue_task()
>     task_move_group()
>     if (queued)
>             enqueue_task()
>
> Unlike the switch to fair class case, if the task is on_rq, it will be
> enqueued right away after we move task groups, and if not, in the future
> when the task is runnable. The attach twice problem can happen if the
> cfs_rq and the task are both new as Vincent discovered. The simplest
> solution is to only reset the task's last_update_time in task_move_group(),
> and then let enqueue_task() do the sched avgs attachment.
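The double attach described in the quoted changelog can be pictured with a small toy model. This is only a sketch: the toy_* types and functions are hypothetical stand-ins, not the kernel's structures or the code touched by the patch. It assumes enqueue attaches a task only when its last_update_time is 0, while the switched_to() path attaches unconditionally.

```c
/* Toy stand-ins for illustration only; not the kernel's data structures. */
struct toy_cfs_rq {
	unsigned long load_avg;
};

struct toy_task {
	unsigned long load_avg;
	unsigned long long last_update_time;	/* 0 means "never attached" */
};

/* Add the task's load to the runqueue and stamp the task. */
static void toy_attach(struct toy_cfs_rq *rq, struct toy_task *p,
		       unsigned long long now)
{
	rq->load_avg += p->load_avg;
	p->last_update_time = now;
}

/* Assumed enqueue behavior: attach only a task that looks new. */
static void toy_enqueue(struct toy_cfs_rq *rq, struct toy_task *p,
			unsigned long long now)
{
	if (p->last_update_time == 0)
		toy_attach(rq, p, now);
}

/* Assumed switched_to() behavior: attach without checking anything. */
static void toy_switched_to(struct toy_cfs_rq *rq, struct toy_task *p,
			    unsigned long long now)
{
	toy_attach(rq, p, now);
}
```

In this model, calling toy_enqueue() and then toy_switched_to() on a task that starts with last_update_time == 0 adds the task's load to the runqueue twice.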
I still have concerns about this change in behavior, which attaches
the task only when it is enqueued. The load avg of the task will not
be decayed between the time we move it into its new group and its
enqueue. With this change, a task's load can stay high even though it
has slept for the last couple of seconds. In addition, its load and
utilization are no longer accounted anywhere in the meantime, just
because we have moved the task, which will be enqueued on the same rq.
A task should always be attached to a cfs_rq, and its load/utilization
should always be accounted on a cfs_rq and decayed for its sleep
period.
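To make the decay concern concrete, here is a rough toy model. It is a sketch under stated assumptions, not kernel code: the toy_* names are hypothetical, time is counted in periods of about 1 ms, and decay is approximated by halving the load once per full 32 periods (the PELT half-life), ignoring partial half-lives.

```c
#include <stdint.h>

/* PELT halves a contribution every 32 periods; each period is ~1 ms. */
#define TOY_HALFLIFE_PERIODS 32

/* Toy stand-in for a task's load tracking state; time is in periods. */
struct toy_task {
	uint64_t last_update_time;
	uint64_t load_avg;
};

/* Crude decay: one halving per full half-life elapsed. */
static uint64_t toy_decay(uint64_t load, uint64_t periods)
{
	uint64_t halvings = periods / TOY_HALFLIFE_PERIODS;

	return halvings >= 64 ? 0 : load >> halvings;
}

/* Task stayed attached: its sleep time is decayed at wake-up. */
static uint64_t toy_wakeup_attached(struct toy_task *p, uint64_t now)
{
	p->load_avg = toy_decay(p->load_avg, now - p->last_update_time);
	p->last_update_time = now;
	return p->load_avg;
}

/*
 * last_update_time was reset to 0 when the group changed: the task is
 * treated as new at enqueue, so the sleep time is never decayed.
 */
static uint64_t toy_wakeup_after_reset(struct toy_task *p, uint64_t now)
{
	if (p->last_update_time == 0) {
		p->last_update_time = now;
		return p->load_avg;
	}
	return toy_wakeup_attached(p, now);
}
```

In this model, a task with load 1024 that sleeps for about 2000 periods wakes up with a load of 0 if it stayed attached, but still carries 1024 if its last_update_time was reset while it slept.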
Regards,
Vincent
>
> Reported-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> Signed-off-by: Yuyang Du <yuyang.du@xxxxxxxxx>
> ---