[tip:sched/core] sched/balancing: Reduce the rate of needless idle load balancing

From: tip-bot for Tim Chen
Date: Thu Jun 05 2014 - 10:35:56 EST


Commit-ID: ed61bbc69c773465782476c7e5869fa5607fa73a
Gitweb: http://git.kernel.org/tip/ed61bbc69c773465782476c7e5869fa5607fa73a
Author: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
AuthorDate: Tue, 20 May 2014 14:39:27 -0700
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Thu, 5 Jun 2014 11:52:01 +0200

sched/balancing: Reduce the rate of needless idle load balancing

The current no_hz idle load balancer does load balancing for *all* idle cpus,
even when the next scheduled load balance for a particular idle cpu is
still some time in the future. This results in a much higher load
balancing rate than necessary. This patch changes the behavior so that
idle load balancing is done on behalf of an idle cpu only when that cpu
is due for load balancing.
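
The due-check described above can be sketched in user space as follows.
This is only an illustration under stated assumptions, not the kernel
code: `struct toy_rq`, `toy_time_after_eq()`, `toy_nohz_idle_balance()`
and the `interval` parameter are hypothetical names, and the real
balancer computes the next balance time in rebalance_domains() rather
than adding a fixed interval.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-cpu runqueue; only what the sketch needs. */
struct toy_rq {
	unsigned long next_balance;	/* tick at which this cpu is next due */
	unsigned int balanced;		/* times a balance was done on its behalf */
};

/* Wraparound-safe "a is at or after b", the idea behind time_after_eq(). */
static bool toy_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/*
 * Walk the idle cpus on behalf of the balancing cpu and balance only
 * those whose next_balance has been reached. Returns how many cpus were
 * actually balanced. The fixed interval is an assumption of this sketch.
 */
static unsigned int toy_nohz_idle_balance(struct toy_rq *rqs, size_t n,
					  unsigned long now,
					  unsigned long interval)
{
	unsigned int done = 0;

	for (size_t i = 0; i < n; i++) {
		if (!toy_time_after_eq(now, rqs[i].next_balance))
			continue;	/* not due yet: skip the expensive work */

		rqs[i].balanced++;
		rqs[i].next_balance = now + interval;
		done++;
	}
	return done;
}
```

With three idle cpus due at ticks 100, 105 and 90, a pass at tick 100
balances only the first and the third; before the change all three would
have been balanced on every pass.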

On SGI's systems with over 3000 cores, the cpu responsible for idle balancing
was overwhelmed by the balancing work and introduced a lot of OS noise
to workloads. This patch fixes the issue.

Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Acked-by: Russ Anderson <rja@xxxxxxx>
Reviewed-by: Rik van Riel <riel@xxxxxxxxxx>
Reviewed-by: Jason Low <jason.low2@xxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Len Brown <len.brown@xxxxxxxxx>
Cc: Dimitri Sivanich <sivanich@xxxxxxx>
Cc: Hedi Berriche <hedi@xxxxxxx>
Cc: Andi Kleen <andi@xxxxxxxxxxxxxx>
Cc: Michel Lespinasse <walken@xxxxxxxxxx>
Cc: Peter Hurley <peter@xxxxxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1400621967.2970.280.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/fair.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b71d8c3..7a0c000 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7193,12 +7193,17 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)

 		rq = cpu_rq(balance_cpu);
 
-		raw_spin_lock_irq(&rq->lock);
-		update_rq_clock(rq);
-		update_idle_cpu_load(rq);
-		raw_spin_unlock_irq(&rq->lock);
-
-		rebalance_domains(rq, CPU_IDLE);
+		/*
+		 * If time for next balance is due,
+		 * do the balance.
+		 */
+		if (time_after_eq(jiffies, rq->next_balance)) {
+			raw_spin_lock_irq(&rq->lock);
+			update_rq_clock(rq);
+			update_idle_cpu_load(rq);
+			raw_spin_unlock_irq(&rq->lock);
+			rebalance_domains(rq, CPU_IDLE);
+		}
 
 		if (time_after(this_rq->next_balance, rq->next_balance))
 			this_rq->next_balance = rq->next_balance;
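
The hunk compares jiffies against rq->next_balance with time_after_eq()
rather than a plain `>=`, because the tick counter is an unsigned long
that eventually wraps. A minimal user-space sketch of why that matters
follows; `wrap_safe_after_eq()` and `naive_after_eq()` are hypothetical
helpers written for this illustration, not kernel functions.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Same idea as the kernel's time_after_eq(): take the unsigned difference
 * and interpret it as signed, so the result stays correct across a wrap
 * as long as the two values are less than half the counter range apart.
 */
static bool wrap_safe_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Plain comparison, shown only for contrast; it breaks once the counter wraps. */
static bool naive_after_eq(unsigned long a, unsigned long b)
{
	return a >= b;
}
```

For a deadline of ULONG_MAX - 2 set just before the wrap and a current
tick of 5 just after it, the wrap-safe form reports the deadline as
reached while the naive form reports it as still pending.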