[tip:sched/core] sched/fair: Reset nr_balance_failed after active balancing

From: tip-bot for Srikar Dronamraju
Date: Thu Mar 31 2016 - 05:28:49 EST


Commit-ID: d02c071183e1c01a76811c878c8a52322201f81f
Gitweb: http://git.kernel.org/tip/d02c071183e1c01a76811c878c8a52322201f81f
Author: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>
AuthorDate: Wed, 23 Mar 2016 17:54:44 +0530
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Thu, 31 Mar 2016 10:49:44 +0200

sched/fair: Reset nr_balance_failed after active balancing

To force a task migration during active balancing, nr_balance_failed is set
to cache_nice_tries + 1. However nr_balance_failed is not reset. As a side
effect, the next regular load balance under the same sd, a cache hot task
might be migrated, just because nr_balance_failed count is high.

Resetting nr_balance_failed after a successful active balance ensures
that a hot task is not unreasonably migrated. This can be verified by
looking at othe number of hot task migrations reported by /proc/schedstat.

Signed-off-by: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1458735884-30105-1-git-send-email-srikar@xxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/fair.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0fe30e6..cbb075e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7399,10 +7399,7 @@ more_balance:
&busiest->active_balance_work);
}

- /*
- * We've kicked active balancing, reset the failure
- * counter.
- */
+ /* We've kicked active balancing, force task migration. */
sd->nr_balance_failed = sd->cache_nice_tries+1;
}
} else
@@ -7637,10 +7634,13 @@ static int active_load_balance_cpu_stop(void *data)
schedstat_inc(sd, alb_count);

p = detach_one_task(&env);
- if (p)
+ if (p) {
schedstat_inc(sd, alb_pushed);
- else
+ /* Active balancing done, reset the failure counter. */
+ sd->nr_balance_failed = 0;
+ } else {
schedstat_inc(sd, alb_failed);
+ }
}
rcu_read_unlock();
out_unlock: