Re: [PATCH 4/7] sched/fair: Clean up the logic in fix_small_imbalance()

From: Peter Zijlstra
Date: Tue May 03 2016 - 06:12:44 EST


On Fri, Apr 29, 2016 at 08:32:41PM +0100, Dietmar Eggemann wrote:
> Avoid the need to add scaled_busy_load_per_task on both sides of the if
> condition to determine whether imbalance has to be set to
> busiest->load_per_task or not.
>
> The imbn variable was introduced with commit 2dd73a4f09be ("[PATCH]
> sched: implement smpnice") and the original if condition was
>
> if (max_load - this_load >= busiest_load_per_task * imbn)
>
> which over time changed into the current version where
> scaled_busy_load_per_task is to be found on both sides of
> the if condition.

This appears to have started with:

dd41f596cda0 ("sched: cfs core code")

which for unexplained reasons does:

-	if (max_load - this_load >= busiest_load_per_task * imbn) {
+	if (max_load - this_load + SCHED_LOAD_SCALE_FUZZ >=
+				busiest_load_per_task * imbn) {


And later patches (by me) changed that FUZZ into a variable metric,
because a fixed fuzz like that didn't work at all for the small loads
that result from cgroup tasks.
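
To make the effect of a fixed fuzz concrete, here is a minimal userspace
sketch of the two conditions, not kernel code. The DEMO_FUZZ value and the
helper names are made up for illustration, and it assumes
max_load >= this_load so the unsigned subtraction cannot wrap.

```c
#include <stdbool.h>

/* Hypothetical constant for illustration; NOT the kernel's definition. */
#define DEMO_FUZZ 16UL

/* Original smpnice-era condition, as quoted above. */
static bool one_task_orig(unsigned long max_load, unsigned long this_load,
			  unsigned long busiest_load_per_task,
			  unsigned long imbn)
{
	/* Assumes max_load >= this_load (no unsigned wrap-around). */
	return max_load - this_load >= busiest_load_per_task * imbn;
}

/* Same condition with a fixed fuzz added on the left-hand side. */
static bool one_task_fuzz(unsigned long max_load, unsigned long this_load,
			  unsigned long busiest_load_per_task,
			  unsigned long imbn)
{
	/* Assumes max_load >= this_load (no unsigned wrap-around). */
	return max_load - this_load + DEMO_FUZZ >=
		busiest_load_per_task * imbn;
}
```

With loads around 1024 the fuzz only nudges a borderline case over the
threshold. With a per-task load of, say, 10 (the kind of small value a
cgroup task can produce), the fuzz alone exceeds the right-hand side, so
the condition is true even when max_load == this_load. That is one
plausible reading of why a fixed value could not work there.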



Now, fix_small_imbalance() has always hurt my head; it originated in
Nick's original sched_domain balancer, which wasn't smpnice aware, and
lives on to this day.

Its purpose is to determine whether moving one task over is beneficial.
However, over time -- and smpnice started this -- the idea of _one_ task
became quite muddled.

With today's fine-grained load accounting, does it even make sense
to ask this question? IOW, what does fix_small_imbalance() really gain
us -- other than a headache?
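
As a rough illustration of how the notion of _one_ task gets muddled once
tasks carry different weights, here is a small userspace sketch, not kernel
code. It assumes load_per_task is simply the summed weight divided by the
number of runnable tasks; the helper name is hypothetical. The weights 1024
and 15 are the usual values for nice 0 and nice 19.

```c
/* Weights for nice 0 and nice 19, as in the kernel's nice-to-weight table. */
#define DEMO_WEIGHT_NICE_0	1024UL
#define DEMO_WEIGHT_NICE_19	15UL

/*
 * Hypothetical helper: average load per task for a group, assuming it is
 * the summed task weight divided by the number of runnable tasks.
 */
static unsigned long demo_load_per_task(const unsigned long *weights,
					unsigned int nr)
{
	unsigned long sum = 0;
	unsigned int i;

	for (i = 0; i < nr; i++)
		sum += weights[i];

	/* An empty group has no per-task load. */
	return nr ? sum / nr : 0;
}
```

For a group holding one nice 0 and one nice 19 task the average comes out
at 519, which matches neither task that could actually be moved.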