Re: [PATCH] sched: move enough load to balance average load per task

From: Siddha, Suresh B
Date: Tue Apr 11 2006 - 22:00:02 EST


On Wed, Apr 12, 2006 at 09:46:32AM +1000, Peter Williams wrote:
> Siddha, Suresh B wrote:
> > On Mon, Apr 10, 2006 at 04:45:32PM +1000, Peter Williams wrote:
> >> Problem:
> >>
> >> The current implementation of find_busiest_group() recognizes that
> >> approximately equal average loads per task for each group/queue are
> >> desirable (e.g. this condition will increase the probability that the
> >> top N highest priority tasks on an N CPU system will be on different
> >> CPUs) by being slightly more aggressive when *imbalance is small but the
> >> average load per task in "busiest" group is more than that in "this"
> >> group. Unfortunately, the amount moved from "busiest" to "this" is too
> >> small to reduce the average load per task on "busiest" (at best there
> >> will be no change and at worst it will get bigger).
> >
> > Peter, We don't need to reduce the average load per task on "busiest"
> > always. By moving a "busiest_load_per_task", we will increase the
> > average load per task of lesser busy cpu (there by trying to achieve
> > the equality with busiest...)
>
> Well, first off, we don't always move busiest_load_per_task we move UP
> TO busiest_load_per_task so there is no way you can make definitive
> statements about what will happen to the value "this_load_per_task" as a
> result of setting *imbalance to busiest_load_per_task. Load balancing
> is a probabilistic endeavour and we need to take steps that increase the
> probability that we get the desired result.

I agree with you. But the previous code was more conservative and may slowly
(just from theory pt of view... I don't have an example to show this..)
balance towards the desired state. With this code, I feel we are
aggressive. for example, on a DP system: if I run one high priority
and two low priority processes, they keep hopping from one processor
to another... you may argue it is because of the "top" or some other
process... I agree that it is the case.. But same thing doesn't happen
with the previous version.. I like the conservative approach...

> Without this patch there is no chance that busiest_load_per_task will
> get smaller

Is there an example for this?

> and whether this_load_per_task will get bigger is
> indeterminate. With this patch there IS a chance that
> busiest_load_per_task will decrease and an INCREASED chance that
> this_load_per_task will get bigger. Ergo we have increased the
> probability that the (absolute) difference between this_load_per_task
> and busiest_load_per_task will decrease. This is a desirable outcome.

All I am saying is we are more aggressive.. I don't have any issue with
the desired outcome..

thanks,
suresh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/