Re: [tip:sched/core] sched/fair: Clean up scale confusion

From: Morten Rasmussen
Date: Fri May 20 2016 - 06:12:01 EST


On Fri, May 13, 2016 at 09:23:50AM +0200, Vincent Guittot wrote:
> On 12 May 2016 at 21:42, Yuyang Du <yuyang.du@xxxxxxxxx> wrote:
> > On Thu, May 12, 2016 at 03:31:27AM -0700, tip-bot for Peter Zijlstra wrote:
> >> Commit-ID: 1be0eb2a97d756fb7dd8c9baf372d81fa9699c09
> >> Gitweb: http://git.kernel.org/tip/1be0eb2a97d756fb7dd8c9baf372d81fa9699c09
> >> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> >> AuthorDate: Fri, 6 May 2016 12:21:23 +0200
> >> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> >> CommitDate: Thu, 12 May 2016 09:55:33 +0200
> >>
> >> sched/fair: Clean up scale confusion
> >>
> >> Wanpeng noted that the scale_load_down() in calculate_imbalance() was
> >> weird. I agree, it should be SCHED_CAPACITY_SCALE, since we're going
> >> to compare against busiest->group_capacity, which is in [capacity]
> >> units.
>
> In fact, load_above_capacity is only about load and not about capacity.
>
> load_above_capacity -= busiest->group_capacity is an optimization (maybe
> a wrong one) of
> load_above_capacity -= busiest->group_capacity * SCHED_LOAD_SCALE /
> SCHED_CAPACITY_SCALE
>
> so we subtract load from load

I like your approach, as you compute the desired minimum load, which is
essentially finding the number of NICE_0_LOAD tasks we want in the group,
and then determine how much excess load there is. So it becomes quite
clear that it is load.
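
To spell out the arithmetic, here is a minimal sketch of that computation.
It is not the code from the patch: excess_load() and its load_scale
parameter are hypothetical names, the constants are illustrative, and
returning 0 when there is no excess is a simplification of my own.

```c
#include <stdint.h>

/* Illustrative value: capacity of one full CPU, in [capacity] units. */
#define SCHED_CAPACITY_SCALE 1024ULL

/*
 * Hypothetical helper, for illustration only.
 *
 * nr_running:     number of tasks in the group, each assumed to be an
 *                 always-running nice-0 task
 * group_capacity: capacity of the group, in [capacity] units
 * load_scale:     load contributed by one such task, in [load] units
 *
 * Returns the excess load in [load] units, or 0 if there is none.
 */
static uint64_t excess_load(uint64_t nr_running, uint64_t group_capacity,
                            uint64_t load_scale)
{
	/* Load the group carries under the nice-0 assumption. */
	uint64_t load = nr_running * load_scale;

	/* Desired minimum load: capacity converted to [load] units. */
	uint64_t min_load = group_capacity * load_scale / SCHED_CAPACITY_SCALE;

	return load > min_load ? load - min_load : 0;
}
```

With load_scale equal to SCHED_CAPACITY_SCALE the conversion is a no-op,
which is why subtracting group_capacity directly gives the same number.
With any other load_scale the two differ, and only the converted form
subtracts load from load.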

While it preserves existing behaviour, I would question the whole
NICE_0_LOAD assumption. It falls apart completely with PELT, and
whenever we have tasks with nice != 0.

Also, it doesn't address the existing unit issue, as load_above_capacity
is later multiplied by busiest->group_capacity when computing the
imbalance. As said in the other thread, we should either kill the
minimum load estimation that assumes always-running NICE_0_LOAD tasks,
or at least make sure the scaling of load_above_capacity is correct.
Patches attempting either solution are in the other thread.
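
To illustrate the scaling concern, here is a rough sketch of that later
multiplication. scaled_pull() is a hypothetical name, and dividing the
product by SCHED_CAPACITY_SCALE is my assumption about how the imbalance
is normalised; treat it as an illustration, not a copy of
calculate_imbalance().

```c
#include <stdint.h>

/* Illustrative value: capacity of one full CPU, in [capacity] units. */
#define SCHED_CAPACITY_SCALE 1024ULL

/*
 * Hypothetical helper, for illustration only.
 *
 * Scales a pull limit by the capacity of the group. The result is in
 * the same units as max_pull, so it is only a load if max_pull was a
 * load to begin with.
 */
static uint64_t scaled_pull(uint64_t max_pull, uint64_t group_capacity)
{
	return max_pull * group_capacity / SCHED_CAPACITY_SCALE;
}
```

If the load scale is 1 << 20 rather than 1024, a limit computed in
[capacity] units comes out 1024 times smaller than the same limit
computed in [load] units, and this step does nothing to correct it.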