Re: [patch] sched: fix improper load balance across sched domain

From: Ingo Molnar
Date: Wed Oct 17 2007 - 03:20:31 EST



* Ken Chen <kenchen@xxxxxxxxxx> wrote:

> We recently discovered a nasty performance bug in the kernel CPU load
> balancer, where we were hit by a 50% performance regression.
>
> When tasks are assigned via cpu affinity to a subset of CPUs that
> spans sched_domains (either ccNUMA node or the new multi-core
> domain), the kernel fails to load balance properly at these domains,
> because several pieces of logic in find_busiest_group() misidentify
> the busiest sched group within a given domain. This leads to
> inadequate load balancing and causes the 50% performance hit.
[...]
> So proposing the following fix: add additional logic in
> find_busiest_group() to detect intrinsic imbalance within the busiest
> group. When such a condition is detected, load balancing goes into
> spread mode instead of the default grouping mode.

thanks - i've added your fix to the scheduler queue, and i'll check it
with a few workloads too. (Right now the scheduler queue is blocked by a
showstopper crasher bug in group scheduling and we are trying to fix
that first, before making any other changes.)

Ingo