Re: About CPU's Load Balance and CFS functions

From: Ingo Molnar
Date: Tue Sep 08 2009 - 02:55:35 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Mon, 2009-09-07 at 16:14 +0800, lookeylam wrote:
> > Hello:
> > I am not sure this is the right mailing list for this
> > question, but I will give it a try.
> > I tested apache with the ab command on a Dell 1950 with 8 cpus
> > on board. I find that in linux 2.6.18 the processes forked by
> > apache are not well distributed across these 8 cpus.
> > linux 2.6.23 is a little better than 2.6.18, but some cpus are
> > still busy while others remain idle.
> > In 2.6.30, all 8 cpus are well used and the utilization of each
> > cpu is nearly the same. And when I start a control group of the
> > cpuset type with sched_relax_domain_level set (to values 3, 4, 5),
> > the ab result is 50ms better than the result without a control
> > group.
> >
> > I attribute this to load_balance rather than CFS, because CFS is
> > just a scheduler for organizing the processes inside one cpu,
> > while load_balance is the main mechanism that controls the
> > processes and load between different cpus.
> > But having reached this conclusion, I am confused about how
> > load_balance differs between these three kernels.
> >
> > My questions are: is the above conclusion right or not? How
> > does this situation happen, and why? I read the kernel code
> > but I am still not sure.
>
> Load-balancing is generally considered part of the scheduler as a
> whole. While CFS is indeed the cpu scheduler, it and the
> load-balancer are related because they have to work together.
>
> Now, in the past 3+ years the load-balancer has undergone
> significant changes too, and we're now poking at it again; .32
> will likely have quite radical changes to the whole load balancer.
>
> The sched_relax_domain_level knob is one that controls one of the
> coupling mechanisms, namely wake on idle, that is, we try and push
> newly woken tasks away to idle cpus. The level you put in there is
> related to the sched_domain level.
>
> Normally we don't try and push newly woken tasks too far away,
> because that'll increase the remote access penalty for related
> tasks, but some workloads have lots of very short running
> unrelated tasks which do benefit from this.
>
> Anyway, I would suggest you keep an eye out for scheduler patches
> if you're interested in this; all the scheduler development
> happens in -tip.

Which can be tested via:

http://people.redhat.com/mingo/tip.git/README
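
A minimal sketch, not part of the original thread, of how the
sched_relax_domain_level knob discussed above might be set from a
script. It assumes a cpuset control group whose directory exposes the
file as cpuset.sched_relax_domain_level (a legacy "mount -t cpuset"
mount names it sched_relax_domain_level, without the prefix). The
helper name and the example path are hypothetical; the level
descriptions follow the kernel's cpusets documentation.

```python
import os
import tempfile

# Level meanings as described in the kernel's cpusets documentation.
RELAX_LEVELS = {
    -1: "no request; use the system default",
    0: "no search",
    1: "search siblings (hyperthreads in a core)",
    2: "search cores in a package",
    3: "search cpus in a node",
    4: "search nodes in a chunk of nodes (NUMA)",
    5: "search system wide (NUMA)",
}


def set_relax_domain_level(cpuset_dir, level,
                           filename="cpuset.sched_relax_domain_level"):
    """Write `level` into the knob file inside `cpuset_dir`.

    `cpuset_dir` is assumed to be an existing cpuset group directory,
    e.g. /dev/cpuset/apache (a hypothetical path). Writing to a real
    cpuset normally requires root. Returns the path written to.
    """
    if level not in RELAX_LEVELS:
        raise ValueError("unsupported relax domain level: %r" % (level,))
    path = os.path.join(cpuset_dir, filename)
    with open(path, "w") as knob:
        knob.write("%d\n" % level)
    return path


if __name__ == "__main__":
    # Dry run against a scratch directory so no root access is needed;
    # on a real system, pass the cpuset group's directory instead.
    scratch = tempfile.mkdtemp()
    written = set_relax_domain_level(scratch, 3)
    with open(written) as knob:
        print("level %s: %s" % (knob.read().strip(), RELAX_LEVELS[3]))
```

A higher level lets the wake-on-idle search span a larger sched_domain,
which is the trade-off against remote access penalty described above.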

Ingo