Re: [PATCH -v2 12/18] sched/fair: Rewrite PELT migration propagation

From: Vincent Guittot
Date: Tue Oct 10 2017 - 03:45:20 EST


On 10 October 2017 at 09:29, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, Oct 09, 2017 at 05:29:04PM +0200, Vincent Guittot wrote:
>> On 9 October 2017 at 17:03, Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote:
>> > On 1 September 2017 at 15:21, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
>> >> +/*
>> >> + * When on migration a sched_entity joins/leaves the PELT hierarchy, we need to
>> >> + * propagate its contribution. The key to this propagation is the invariant
>> >> + * that for each group:
>> >> + *
>> >> + * ge->avg == grq->avg (1)
>> >> + *
>> >> + * _IFF_ we look at the pure running and runnable sums. Because they
>> >> + * represent the very same entity, just at different points in the hierarchy.
>> >
>> > I agree for the running part because only one entity can be running
>> > but i'm not sure for the pure runnable sum because we can have
>> > several runnable task in a cfs_rq but only one runnable group entity
>> > to reflect them or I misunderstand (1)
>
> The idea is that they (ge and grq) are the _same_ entity, just at
> different levels in the hierarchy. If the grq is runnable, it is through
> the ge.
>
> As a whole, they don't care how many runnable tasks there are.

Ok so I agree with this point that both grp and ge runnable follow the
same trend

my point is that even if same changes apply on both grp and ge, we
can't directly apply changes of grp's runnable on ge's runnable

>
>> > As an example, we have 2 always running task TA and TB so their
>> > load_sum is LOAD_AVG_MAX for each task The grq->avg.load_sum = \Sum
>> > se->avg.load_sum = 2*LOAD_AVG_MAX But the ge->avg.load_sum will be
>> > only LOAD_AVG_MAX
>> >
>> > So If we apply directly the d(TB->avg.load_sum) on the group hierachy
>> > and on ge->avg.load_sum in particular, the latter decreases to 0
>> > whereas it should decrease only by half
>> >
>> > I have been able to see this wrong behavior with a rt-app json file
>> >
>> > so I think that we should instead remove only
>> >
>> > delta = se->avg.load_sum / grq->avg.load_sum * ge->avg.load_sum
>>
>> delta = se->avg.load_sum / (grq->avg.load_sum+se->avg.load_sum) *
>> ge->avg.load_sum
>>
>> as the se has already been detached
>>
>> > We don't have grq->avg.load_sum but we can have a rough estimate with
>> > grq->avg.load_avg/grq->weight
>
> Hurm, I think I see what you're saying, let me ponder this more.

The formula above works was an example for detaching but it doesn't
apply for all use case. we need a more generic way to propagate
runnable changes