Re: [PATCH 10/13] remove aggressive idle balancing

From: Nick Piggin
Date: Mon Mar 07 2005 - 03:30:00 EST


Siddha, Suresh B wrote:
> Nick,
>
> On Mon, Mar 07, 2005 at 04:34:18PM +1100, Nick Piggin wrote:


>> Active balancing should only kick in after the prescribed number
>> of rebalancing failures - can_migrate_task will see this, and
>> will allow the balancing to take place.


> We are resetting the nr_balance_failed to cache_nice_tries after
> kicking active balancing. But can_migrate_task will succeed only if
> nr_balance_failed > cache_nice_tries.


It is indeed, thanks for catching that. We should probably make it
reset the count to the point where it will start moving cache hot
tasks (ie. cache_nice_tries+1).

I'll look at that and send Andrew a patch.
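The off-by-one under discussion can be sketched with a toy model (illustrative only - the struct and function names below are stand-ins, not the kernel's; the real code reads sd->nr_balance_failed and sd->cache_nice_tries inside can_migrate_task):

```c
#include <assert.h>

/* Toy stand-in for the sched_domain fields under discussion. */
struct toy_sd {
	unsigned int nr_balance_failed;
	unsigned int cache_nice_tries;
};

/* Sketch of the cache-hot gate: a cache-hot task may only be pulled
 * once the domain has failed to balance strictly more than
 * cache_nice_tries times. Resetting the failure count to
 * cache_nice_tries therefore leaves the gate closed; resetting it to
 * cache_nice_tries + 1 opens it. */
static int may_move_cache_hot(const struct toy_sd *sd)
{
	return sd->nr_balance_failed > sd->cache_nice_tries;
}
```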


>> That said, we currently aren't doing _really_ well for SMT on
>> some workloads, however with this patch we are heading in the
>> right direction I think.


> Let's take an example of three packages with two logical threads each.
> Assume P0 is loaded with two processes (one on each logical thread),
> P1 contains only one process, and P2 is idle.
>
> In this example, active balance will be kicked on one of the threads
> (assume thread 0) in P0, which then should find an idle package and
> move a process to one of the idle threads in P2.
>
> With your current patch, the idle package check in active_load_balance
> has disappeared, and we may end up moving the process from thread 0 to
> thread 1 in P0. I can't really make sense of the active_load_balance
> code after your patch 10/13.


Ah yep, right you are there, too. I obviously hadn't looked closely
enough at the recent active_load_balance patches that had gone in :(
What it should probably do is heed the "push_cpu" prescription
(push_cpu is now unused).

I think active_load_balance is too complex at the moment, but still
too dumb to really make the right choice here over the full range of
domains. What we need to do is pass in some more info from load_balance,
so active_load_balance doesn't need any "smarts".

Thanks for pointing this out too. I'll make a patch.
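Suresh's three-package example can be modelled as a toy sketch (again, illustrative names only - nothing below exists in the kernel): prefer pushing the task to a thread in a fully idle package, which is the role the removed cpu_and_siblings_are_idle check played, before falling back to any other idle sibling.

```c
#include <assert.h>

#define NR_PACKAGES     3
#define THREADS_PER_PKG 2

/* load[p][t] is 1 if logical thread t of package p runs a task. */
static int pkg_is_idle(int load[NR_PACKAGES][THREADS_PER_PKG], int pkg)
{
	for (int t = 0; t < THREADS_PER_PKG; t++)
		if (load[pkg][t])
			return 0;
	return 1;
}

/* Pick a destination CPU (encoded as pkg * THREADS_PER_PKG + thread)
 * for a task pushed off an overloaded package: first a thread in a
 * fully idle package, then any other idle thread, else -1. */
static int pick_push_target(int load[NR_PACKAGES][THREADS_PER_PKG],
			    int src_pkg)
{
	for (int p = 0; p < NR_PACKAGES; p++)
		if (p != src_pkg && pkg_is_idle(load, p))
			return p * THREADS_PER_PKG;

	for (int p = 0; p < NR_PACKAGES; p++)
		for (int t = 0; t < THREADS_PER_PKG; t++)
			if (p != src_pkg && !load[p][t])
				return p * THREADS_PER_PKG + t;
	return -1;
}
```

With P0 fully loaded, P1 half loaded and P2 idle, this picks a thread in P2 rather than P0's idle sibling, matching the behaviour Suresh expects.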


>> I have been mainly looking at tuning CMP Opterons recently (they
>> are closer to a "traditional" SMP+NUMA than SMT, from the
>> scheduler's point of view). However, in earlier revisions of the
>> patch I had been looking at SMT performance and was able to get
>> it much closer to perfect:



> I am reasonably sure that the removal of the cpu_and_siblings_are_idle
> check from active_load_balance will cause HT performance regressions.


Yep.


>> I was working on a 4 socket x440 with HT. The problem area is
>> usually when the load is lower than the number of logical CPUs.
>> So on tbench, we do say 450MB/s with 4 or more threads without
>> HT, and 550MB/s with 8 or more threads with HT, however we only
>> do 300MB/s with 4 threads.


> Are you saying 2.6.11 has this problem?


I think so. I'll have a look at it again.


>> Those aren't the exact numbers, but that's basically what they
>> look like. Now I was able to bring the 4 thread + HT case much
>> closer to the 4 thread - HT numbers, but with earlier patchsets.
>> When I get a chance I will do more tests on the HT system, but
>> the x440 is infuriating for fine tuning performance, because it
>> is a NUMA system, but it doesn't tell the kernel about it, so
>> it will randomly schedule things on "far away" CPUs, and results
>> vary.


> Why don't you use any other simple HT+SMP system?


Yes I will, of course. Some issues can become more pronounced
with more physical CPUs, but the main reason is that the x440
is the only machine with HT at work where I was doing testing.

To be honest I hadn't looked hard enough at the HT issues yet
as you've noticed. So thanks for the review and I'll fix things
up.

I will also do some performance analysis with your other patches
on some of the systems that I have access to.


Thanks.

Nick
