Re: [PATCH v1 00/10] Optimize sched avgs computation and implement flat util hierarchy

From: Dietmar Eggemann
Date: Thu Sep 01 2016 - 16:58:57 EST


On 29/08/16 02:37, Yuyang Du wrote:
> On Tue, Aug 23, 2016 at 04:39:51PM +0100, Dietmar Eggemann wrote:
>> On 23/08/16 15:45, Vincent Guittot wrote:
>>> On 23 August 2016 at 16:13, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>>> On Tue, Aug 23, 2016 at 03:28:19PM +0200, Vincent Guittot wrote:
>>>>> I still wonder if using a flat util hierarchy is the right solution to
>>>>> solve this problem with utilization and task groups. I have noticed the
>>>>> exact same issues with load, which generate weird task placement
>>>>> decisions, and I think that we should probably try to solve both wrong
>>>>> behaviors with the same mechanism. But this is not possible with a flat
>>>>> hierarchy for load.
>>>>>
>>>>> Let me take an example.
>>>>> TA is an always-running task on CPU1 in group /root/level1/
>>>>> TB wakes up on CPU0 and moves TA into group /root/level2/
>>>>> Even if TA stays on CPU1, runnable_load_avg of CPU1 root cfs rq will become 0.
>>>>
>>>> Because while we migrate the load_avg on /root/level2, we do not
>>>> propagate the load_avg up the hierarchy?
>>>
>>> Yes. Right now, the load of a cfs_rq and the load of the sched_entity
>>> that represents it at the parent level are disconnected.
>>
>> I guess you say 'disconnected' because cfs_rq and se (w/ cfs_rq eq.
>> se->my_q) are now independent PELT signals, whereas before the rewrite
>> they were 'connected' for load via __update_tg_runnable_avg(),
>> __update_group_entity_contrib() in __update_entity_load_avg_contrib()
>> and for utilization via 'se->avg.utilization_avg_contrib =
>> group_cfs_rq(se)->utilization_load_avg' in
>> __update_entity_utilization_avg_contrib().
>
> I don't understand what exactly "disconnected" means, but with respect to
> group_entity's load_avg, essentially nothing has changed:
>

True, but this is the update_cfs_shares() side of things.

> group_entity_load_avg = my_cfs_rq_load_avg / tg_load_avg * tg_shares
>
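
As a rough illustration of the formula quoted above, here is a minimal
sketch in plain C. The helper name toy_group_entity_load_avg() and the
zero return for an empty task group are assumptions made for this example
only; this is not the kernel's update_cfs_shares() code. The one point it
tries to show is that with integer arithmetic the multiplication has to
come before the division, otherwise the ratio truncates to 0.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): share of the task group's
 * weight that one group entity gets, following
 *   group_entity_load_avg = my_cfs_rq_load_avg / tg_load_avg * tg_shares
 */
static uint64_t toy_group_entity_load_avg(uint64_t my_cfs_rq_load_avg,
					  uint64_t tg_load_avg,
					  uint64_t tg_shares)
{
	/* Assumption for this sketch: an empty task group yields 0. */
	if (!tg_load_avg)
		return 0;

	/* Multiply first; (load / tg_load) alone would truncate to 0. */
	return tg_shares * my_cfs_rq_load_avg / tg_load_avg;
}
```

For example, a group cfs_rq that carries 512 out of a task group total of
2048, with tg_shares of 1024, would get 256 in this sketch.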

'Connected', for me, means that in the old implementation every call to
update_entity_load_avg(se, 1) (with !entity_is_task(se)) updates the
contribution of the group cfs_rq (se->my_q) towards the se in
__update_entity_[load|utilization]_avg_contrib() and immediately adds the
returned delta to the appropriate values of the cfs_rq the se is queued on
(se->cfs_rq). Doing this in for_each_sched_entity(se) gives this nice
propagation effect towards the root cfs_rq.

In the new implementation, se, se->my_q and se->cfs_rq have independent
PELT signals, hence the 'disconnected'.
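
To make the 'connected' behaviour a bit more concrete, here is a toy model
in plain C. All toy_* names are made up for this sketch, and the group
entity's contribution is simply taken to be its group cfs_rq's load_avg
(no shares scaling, no decay), so this is only an approximation of the
idea and not the old kernel code. It shows a delta being computed per
group se and folded into se->cfs_rq while walking up to the root.

```c
#include <stddef.h>

/* Toy types; the field names mirror the discussion, not the kernel. */
struct toy_cfs_rq {
	long load_avg;			/* sum of queued entities' contributions */
};

struct toy_se {
	long contrib;			/* current contribution to se->cfs_rq */
	struct toy_cfs_rq *cfs_rq;	/* rq this se is queued on */
	struct toy_cfs_rq *my_q;	/* group rq owned by se, NULL for a task */
	struct toy_se *parent;		/* NULL at the top level */
};

/* Refresh a group se's contribution from my_q and return the delta. */
static long toy_update_contrib(struct toy_se *se)
{
	long old = se->contrib;

	if (se->my_q)
		se->contrib = se->my_q->load_avg;	/* simplification */
	return se->contrib - old;
}

/* 'Connected': add each delta to se->cfs_rq on the way to the root. */
static void toy_propagate(struct toy_se *se)
{
	for (; se; se = se->parent)
		se->cfs_rq->load_avg += toy_update_contrib(se);
}

/* Attach a task with the given load and propagate towards the root. */
static void toy_attach_task(struct toy_se *task, long load)
{
	task->contrib = load;
	task->cfs_rq->load_avg += load;
	toy_propagate(task->parent);
}
```

In this model, attaching TA with a load of 1024 to the level2 group rq
raises the root rq's load_avg to 1024 as well. Dropping the
toy_propagate() call leaves the root at 0, which is the 'disconnected'
situation from Vincent's example.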