Re: [PATCH] sched/fair: Do not decay new task load on first enqueue

From: Dietmar Eggemann
Date: Tue Sep 27 2016 - 09:48:56 EST


On 23/09/16 15:30, Vincent Guittot wrote:
> Hi Matt,
>
> On 23 September 2016 at 13:58, Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx> wrote:
>> Since commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new
>> tasks") ::last_update_time will be set to a non-zero value in
>> post_init_entity_util_avg(), which leads to p->se.avg.load_avg being
>> decayed on enqueue before the task has even had a chance to run.
>>
>> For a NICE_0 task the sequence of events leading up to this with
>> example load average changes might be,
>>
>> sched_fork()
>>   init_entity_runnable_average()
>>     p->se.avg.load_avg = scale_load_down(se->load.weight); // 1024
>>
>> wake_up_new_task()
>>   post_init_entity_util_avg()
>>     attach_entity_load_avg()
>>       p->se.last_update_time = cfs_rq->avg.last_update_time;
>>
>>   activate_task()
>>     enqueue_task()
>>       ...
>>         enqueue_entity_load_avg()
>>           migrated = !sa->last_update_time // false
>>           if (!migrated)
>>             __update_load_avg()
>>               p->se.avg.load_avg = 1002
>
> Does it mean that you can see the perf drop that you mention below
> because load is decayed to 1002 instead of staying to 1024 ?

I think Matt is talking about the fact that cfs_rq->runnable_load_avg
is 0 once the hackbench task has been dequeued for the first time.

Without this patch, se->avg.load_avg has exactly the same value (e.g.
1002 both times) when we add it to cfs_rq->runnable_load_avg in
enqueue_entity_load_avg() and when we subtract it again in
dequeue_entity_load_avg(). That's because the initial runtime is short
(~250us on my hikey board).

With this patch we add 1024 and subtract ~1002, which leaves
cfs_rq->runnable_load_avg with a small positive value. That makes it
more likely that (load-based) fork-balance picks another cpu for the
next hackbench task.
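
A toy model of that add/subtract accounting, just to make the arithmetic
concrete. This is only a sketch, not the kernel code: struct toy_cfs_rq
and the toy_*() helpers are made-up names, and the values 1024 and 1002
are simply the ones from this thread. The clamp at zero on dequeue is
part of the toy model.

```c
/* Toy stand-in for the one cfs_rq field discussed here (hypothetical). */
struct toy_cfs_rq {
	long runnable_load_avg;
};

/* Enqueue: add the entity's current load_avg to the runqueue sum. */
static void toy_enqueue(struct toy_cfs_rq *rq, long se_load_avg)
{
	rq->runnable_load_avg += se_load_avg;
}

/* Dequeue: subtract the entity's current load_avg, clamped at zero. */
static void toy_dequeue(struct toy_cfs_rq *rq, long se_load_avg)
{
	long v = rq->runnable_load_avg - se_load_avg;

	rq->runnable_load_avg = v > 0 ? v : 0;
}

/*
 * What is left on an otherwise empty runqueue after one enqueue/dequeue
 * pair, given the load_avg seen at enqueue and at dequeue time.
 */
static long toy_residual(long load_at_enqueue, long load_at_dequeue)
{
	struct toy_cfs_rq rq = { .runnable_load_avg = 0 };

	toy_enqueue(&rq, load_at_enqueue);
	toy_dequeue(&rq, load_at_dequeue);
	return rq.runnable_load_avg;
}
```

toy_residual(1002, 1002) models the case without the patch and leaves
nothing behind; toy_residual(1024, 1002) models the case with the patch
and leaves a small positive remainder.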

>
> 1002 mainly comes from period_contrib being set to 1023 during
> init_entity_runnable_average so any delay longer than 1us between
> attach_entity_load_avg and enqueue_entity_load_avg will trig the decay
> of the load from 1024 to 1002
>
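
For reference, the 1024 -> 1002 step can be reproduced with a small
fixed-point sketch. Assumptions: the helper names are made up, the
constant is an approximation of y * 2^32 with y^32 = 0.5 (as I recall it
from the kernel's inverse table, so please double-check it), time is
counted in ~1us units with a 1024us period, and the decay is applied to
load_avg directly instead of going through load_sum.

```c
#include <stdint.h>

/* Assumed: ~ y * 2^32 where y^32 = 0.5 (one-period decay factor). */
#define TOY_PELT_Y_INV_1	0xfa83b2daULL
/* Assumed period length in ~us units. */
#define TOY_PELT_PERIOD_US	1024U

/*
 * Number of full periods crossed when delta_us elapses on top of an
 * already accumulated period_contrib.
 */
static unsigned int toy_periods_crossed(unsigned int period_contrib,
					unsigned int delta_us)
{
	return (period_contrib + delta_us) / TOY_PELT_PERIOD_US;
}

/* Decay val by `periods` full periods, one fixed-point multiply each. */
static uint64_t toy_decay(uint64_t val, unsigned int periods)
{
	while (periods--)
		val = (val * TOY_PELT_Y_INV_1) >> 32;
	return val;
}
```

With period_contrib = 1023, toy_periods_crossed(1023, 1) already reports
one full period, and toy_decay(1024, 1) gives 1002, which matches the
numbers quoted above.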

[...]