From a later email ....
Hopefully just tending to round down more would damp it better.
*imbalance = (*imbalance + SCHED_LOAD_SCALE/2) >> SCHED_LOAD_SHIFT;
Or even remove the addition altogether.
I'd side with just removing the addition altogether ...
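To make the difference between the two options concrete, here is a minimal sketch that compares them side by side. The constant values (a shift of 7, so a scale of 128) and the helper names imbalance_round_nearest and imbalance_round_down are assumptions for illustration only, not taken from the scheduler source.

```c
/* Assumed values for illustration; the real constants live in the kernel headers. */
#define SCHED_LOAD_SHIFT 7
#define SCHED_LOAD_SCALE (1UL << SCHED_LOAD_SHIFT)

/* Round to nearest: the line proposed above (hypothetical helper name). */
static inline unsigned long imbalance_round_nearest(unsigned long scaled)
{
	return (scaled + SCHED_LOAD_SCALE / 2) >> SCHED_LOAD_SHIFT;
}

/* Round down: the addition removed altogether (hypothetical helper name). */
static inline unsigned long imbalance_round_down(unsigned long scaled)
{
	return scaled >> SCHED_LOAD_SHIFT;
}
```

With these assumed constants, a scaled imbalance of half a task (64) still comes out as 1 task to move when rounding to nearest, but as 0 when the addition is dropped. Anything below one full task's worth of load then moves nothing, which is the stronger damping.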
Moreover, as Rick pointed out, it's particularly futile over idle cpus ;-)
I don't follow...
If CPU 7 has 1 task, and cpu 8 has 0 tasks, there's an imbalance of 1.
There is no point whatsoever in bouncing that task back and forth
between cpu 7 and 8 - it just makes things slower, and trashes the cache.
There's *no* fairness issue here.
If CPU 8 has 2 tasks, and cpu 1 has 1 task, there's an imbalance of 1.
*If* that imbalance persists (and it probably won't, given tasks being
created, destroyed, and blocking for IO), we may want to rotate that to
1 vs 2, and then back to 2 vs 1, etc. in the interests of fairness,
even though it's slower throughput overall.
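The two cases above can be written down as a small decision rule. This is only a sketch of the argument, not scheduler code: the function name, its parameters, and the idea of passing in a "persisted" flag are all hypothetical, and it covers only the imbalance-of-1 case discussed here.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: given an imbalance of exactly 1 task, is it worth
 * moving a task to the less loaded cpu?
 *
 * dest_tasks: tasks currently on the less loaded cpu (assumed input)
 * persisted:  whether the imbalance has lasted a while (assumed input)
 */
static inline bool move_on_imbalance_of_one(unsigned int dest_tasks, bool persisted)
{
	/* 1 vs 0: moving just relocates the task and trashes the cache. */
	if (dest_tasks == 0)
		return false;

	/* 2 vs 1: rotate only if the imbalance persists, for fairness. */
	return persisted;
}
```

For the 1 vs 0 case (cpu 7 and cpu 8 above) this never moves anything. For the 2 vs 1 case it moves a task only once the imbalance has persisted, accepting the throughput cost for fairness.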