Re: [PATCH v1 00/10] Optimize sched avgs computation and implement flat util hierarchy

From: Vincent Guittot
Date: Wed Aug 24 2016 - 05:48:30 EST


On 24 August 2016 at 10:54, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> On Tue, Aug 23, 2016 at 04:45:57PM +0200, Vincent Guittot wrote:
>> On 23 August 2016 at 16:13, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> > On Tue, Aug 23, 2016 at 03:28:19PM +0200, Vincent Guittot wrote:
>> >> I still wonder if using a flat util hierarchy is the right solution
>> >> to this problem with utilization and task groups. I have noticed the
>> >> exact same issue with load, which generates weird task placement
>> >> decisions, and I think that we should probably try to solve both
>> >> wrong behaviors with the same mechanism. But this is not possible
>> >> with a flat hierarchy for load.
>> >>
>> >> Let me take an example:
>> >> TA is an always-running task on CPU1 in group /root/level1/.
>> >> TB wakes up on CPU0 and moves TA into group /root/level2/.
>> >> Even if TA stays on CPU1, the runnable_load_avg of CPU1's root cfs_rq
>> >> will become 0.
>> >
>> > Because while we migrate the load_avg to /root/level2, we do not
>> > propagate the load_avg up the hierarchy?
>>
>> Yes. Currently, the load of a cfs_rq and the load of the sched_entity
>> that represents it at the parent level are disconnected.
>>
>> >
>> > And always propagating everything up will indeed also fix the
>> > utilization issue.
>> >
>> > Of course, doing that propagation has its costs..
>>
>> Yes, that's the trade-off.
>>
>> >
>> > Didn't you post a patch doing just this a while ago?
>>
>> My patch did that, but only for utilization, and I have started to
>> work on adding the propagation of load as well.
>
> As Dietmar mentioned already, the 'disconnect' is a feature of the PELT
> rewrite. Paul and Ben's original implementation had full propagation up
> and down the hierarchy. IIRC, one of the key points of the rewrite was
> more 'stable' signals, which we would lose by re-introducing immediate
> updates throughout the hierarchy.
>
> It is a significant change to group scheduling, so I'm a bit surprised
> that nobody has observed any problems since the rewrite. But maybe most
> users don't care about load balancing being slightly off when tasks
> have migrated or new tasks have been added to a group.
>
> If we want to re-introduce propagation of both load and utilization I
> would suggest that we just look at the original implementation. It
> seemed to work.

The previous implementation propagated an entity's load and utilization
changes on most of its updates, whereas we only need to sync when an
entity is attached, detached or migrated.
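
Roughly, what I have in mind looks like the userspace model below (the
struct and function names are mine for illustration, not actual kernel
code): the delta created by an attach or detach is pushed up through
the parent levels at that point only, instead of on every periodic
update.

#include <stdio.h>

struct cfs_rq_model {
	long load_avg;			/* aggregated load at this level */
	struct cfs_rq_model *parent;	/* NULL for the root cfs_rq */
};

/* Push a load delta from a cfs_rq up to the root. */
static void propagate_load_delta(struct cfs_rq_model *cfs_rq, long delta)
{
	for (; cfs_rq; cfs_rq = cfs_rq->parent)
		cfs_rq->load_avg += delta;
}

static void attach_load(struct cfs_rq_model *cfs_rq, long load)
{
	propagate_load_delta(cfs_rq, load);
}

static void detach_load(struct cfs_rq_model *cfs_rq, long load)
{
	propagate_load_delta(cfs_rq, -load);
}

int main(void)
{
	struct cfs_rq_model root   = { 0, NULL };
	struct cfs_rq_model level1 = { 0, &root };
	struct cfs_rq_model level2 = { 0, &root };

	/* TA (load 1024) runs in /root/level1/ on CPU1 */
	attach_load(&level1, 1024);
	printf("root %ld level1 %ld\n", root.load_avg, level1.load_avg);

	/* TB moves TA into /root/level2/: detach + attach keeps the
	 * root cfs_rq load in sync instead of dropping it to 0. */
	detach_load(&level1, 1024);
	attach_load(&level2, 1024);
	printf("root %ld level2 %ld\n", root.load_avg, level2.load_avg);

	return 0;
}

With attach/detach-time propagation, the group move in the example
above leaves the root load_avg at 1024 instead of 0.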

>
> Handling utilization and load differently will inevitably result in more
> code. The 'flat hierarchy' approach seems slightly less complicated, but
> it prevents us from using group utilization later should we wish to do
> so. It might for example become useful for the schedutil cpufreq
> governor should it ever consider selecting frequencies differently based
> on whether the current task is in a (specific) group or not.
>
> Morten
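
Regarding keeping group utilization: note that utilization is invariant
across task groups (a task contributes the same amount to the root
whichever group it sits in), whereas a group entity's load is
re-weighted by the group's shares at each level. That is why the flat
hierarchy can work for utilization but not for load. A userspace sketch
of the difference (the names and numbers are mine, for illustration
only):

#include <stdio.h>

/* Utilization is group-invariant: it can be summed directly at the
 * root, whatever hierarchy the tasks sit in. */
static long root_util(const long *task_util, int nr)
{
	long sum = 0;
	int i;

	for (i = 0; i < nr; i++)
		sum += task_util[i];
	return sum;
}

/* Load is not: a group entity's contribution at the parent level is
 * (roughly) the group's shares scaled by the task's fraction of the
 * group's load, so it depends on where the task sits. */
static long group_entity_load(long task_load, long shares, long grp_load)
{
	return shares * task_load / grp_load;
}

int main(void)
{
	long util[2] = { 300, 200 };

	printf("root util: %ld\n", root_util(util, 2));

	/* Same task load, different root contribution depending on the
	 * group's shares and total group load. */
	printf("via level1: %ld\n", group_entity_load(1024, 1024, 1024));
	printf("via level2: %ld\n", group_entity_load(1024, 512, 2048));

	return 0;
}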