Re: [PATCH 7/7 v3] sched: fix wrong utilization accounting when switching to fair class

From: Vincent Guittot
Date: Thu Sep 15 2016 - 11:37:31 EST


On 15 September 2016 at 15:18, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, Sep 12, 2016 at 09:47:52AM +0200, Vincent Guittot wrote:
>> When a task switches to fair scheduling class, the period between now and
>> the last update of its utilization is accounted as running time whatever
>> happened during this period. This wrong accounting applies to the task
>> and also to the task group branch.
>>
>> When changing the property of a running task like its list of allowed CPUs
>> or its scheduling class, we follow the sequence:
>> -dequeue task
>> -put task
>> -change the property
>> -set task as current task
>> -enqueue task
>>
>> The end of the sequence doesn't follow the normal sequence which is :
>> -enqueue a task
>> -then set the task as current task.
>>
>> This wrong ordering is the root cause of wrong utilization accounting.
>> Update the sequence to follow the right one:
>> -dequeue task
>> -put task
>> -change the property
>> -enqueue task
>> -set task as current task
>
> But enqueue_entity depends on cfs_rq->curr, which is set by
> set_curr_task_fair().

With this sequence, cfs_rq->curr is NULL and the cfs_rq is "idle", because
the entity has been dequeued and put back in the rb tree for the time it
takes to change the properties.

enqueue_entity() uses cfs_rq->curr == se for two things:
- updating current. With this sequence, current is now NULL, so there is
nothing to do.
- skipping the enqueue of the se in the rb tree. With this sequence, the se
is put in the rb tree during the enqueue and taken back out when the task
is set as the current task.

I don't see any functional issue, but we are not doing the same steps
with the new sequence.
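
To make the difference in steps concrete, here is a minimal user-space
sketch of the two orderings. It is not kernel code: the toy_* types and
functions are hypothetical stand-ins that only model the "curr == se"
check in enqueue_entity() and the removal from the rb tree when an
already queued entity becomes current. Everything else (vruntime, load
tracking, the real rb tree) is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types, NOT the kernel's sched_entity / cfs_rq. */
struct toy_se {
	bool on_rq;	/* entity is accounted as queued */
	bool in_tree;	/* entity currently sits in the (modelled) rb tree */
};

struct toy_cfs_rq {
	struct toy_se *curr;	/* entity currently running, or NULL */
	int tree_inserts;	/* how many times an entity entered the tree */
	int tree_removals;	/* how many times an entity left the tree */
};

/* Models only the "se != curr" check: current is never put in the tree. */
static void toy_enqueue_entity(struct toy_cfs_rq *rq, struct toy_se *se)
{
	assert(rq != NULL && se != NULL);
	if (se != rq->curr) {
		se->in_tree = true;
		rq->tree_inserts++;
	}
	se->on_rq = true;
}

/* Models setting the current entity: a queued entity leaves the tree. */
static void toy_set_curr(struct toy_cfs_rq *rq, struct toy_se *se)
{
	assert(rq != NULL && se != NULL);
	if (se->on_rq && se->in_tree) {
		se->in_tree = false;
		rq->tree_removals++;
	}
	rq->curr = se;
}

/* Old ordering: set task as current task, then enqueue task. */
static struct toy_cfs_rq toy_old_sequence(struct toy_se *se)
{
	struct toy_cfs_rq rq = { NULL, 0, 0 };

	toy_set_curr(&rq, se);
	toy_enqueue_entity(&rq, se);	/* se == curr: tree insert skipped */
	return rq;
}

/* New ordering: enqueue task, then set task as current task. */
static struct toy_cfs_rq toy_new_sequence(struct toy_se *se)
{
	struct toy_cfs_rq rq = { NULL, 0, 0 };

	toy_enqueue_entity(&rq, se);	/* curr is NULL: se enters the tree */
	toy_set_curr(&rq, se);		/* se is taken back out of the tree */
	return rq;
}
```

In this toy model both orderings end in the same state (se is current,
queued, and not in the tree). The only difference is that the new
ordering goes through one extra insert and one extra removal, which is
the "not the same steps" point above.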

>
> Also, the normalize comment in dequeue_entity() worries me, 'someone'
> didn't update that when he moved update_min_vruntime() around.