[tip:sched/core] sched: Skip double execution of pick_next_task_fair()

From: tip-bot for Peter Zijlstra
Date: Thu May 08 2014 - 06:43:00 EST


Commit-ID: 6ccdc84b81a0a6c09a7f0427761d2f8cecfc2218
Gitweb: http://git.kernel.org/tip/6ccdc84b81a0a6c09a7f0427761d2f8cecfc2218
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Thu, 24 Apr 2014 12:00:47 +0200
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Wed, 7 May 2014 11:51:35 +0200

sched: Skip double execution of pick_next_task_fair()

Tim wrote:

"The current code will call pick_next_task_fair a second time in the
slow path if we did not pull any task in our first try. This is
really unnecessary as we already know no task can be pulled and it
doubles the delay for the cpu to enter idle.

We instrumented some network workloads and saw that
pick_next_task_fair is frequently called twice before a cpu enters
idle. The call to pick_next_task_fair can add non-trivial latency as
it calls load_balance, which runs find_busiest_group on a hierarchy of
sched domains spanning the cpus of a large system. For some 4 socket
systems, we saw almost 0.25 msec spent per call of pick_next_task_fair
before a cpu can be idled."

Optimize the second call away for the common case and document the
dependency.

Reported-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Len Brown <len.brown@xxxxxxxxx>
Link: http://lkml.kernel.org/r/20140424100047.GP11096@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
 kernel/sched/core.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
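
For readers who want to see the effect outside the kernel, here is a
minimal user-space sketch of the control flow before and after this
change. It is not kernel code: every toy_* name, the two-class model
(fair and idle only) and the scripted run queue are assumptions made
for illustration, and the real slow path walks all scheduling classes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; all toy_* names are hypothetical. */
struct toy_task { const char *name; };

static struct toy_task toy_retry = { "retry" };
static struct toy_task toy_idle  = { "idle" };
#define TOY_RETRY_TASK (&toy_retry)

struct toy_rq {
	struct toy_task *first;	/* result of the first fair pick */
	struct toy_task *later;	/* result of later picks; must not be TOY_RETRY_TASK */
	int fair_calls;		/* how often the fair picker ran */
};

/* Models fair_sched_class.pick_next_task(): scripted, and counted. */
static struct toy_task *toy_pick_fair(struct toy_rq *rq)
{
	return rq->fair_calls++ == 0 ? rq->first : rq->later;
}

/* Models idle_sched_class.pick_next_task(): always has a task. */
static struct toy_task *toy_pick_idle(struct toy_rq *rq)
{
	(void)rq;
	return &toy_idle;
}

/* Models the "again:" slow path, reduced to the fair and idle classes. */
static struct toy_task *toy_slow_path(struct toy_rq *rq)
{
	struct toy_task *p;

	do {
		p = toy_pick_fair(rq);
	} while (p == TOY_RETRY_TASK);

	return p ? p : toy_pick_idle(rq);
}

/* Old fast path: a NULL result falls through and picks fair again. */
static struct toy_task *toy_pick_old(struct toy_rq *rq)
{
	struct toy_task *p = toy_pick_fair(rq);

	if (p && p != TOY_RETRY_TASK)
		return p;

	return toy_slow_path(rq);
}

/* New fast path: only a retry request reaches the slow path. */
static struct toy_task *toy_pick_new(struct toy_rq *rq)
{
	struct toy_task *p = toy_pick_fair(rq);

	if (p == TOY_RETRY_TASK)
		return toy_slow_path(rq);

	/* relies on idle being the class right after fair */
	if (!p)
		p = toy_pick_idle(rq);

	return p;
}
```

With an empty toy run queue (first and later both NULL),
toy_pick_old() runs the fair picker twice and toy_pick_new() runs it
once; both end up returning the idle task.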

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e62c65a..28921ec 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2592,8 +2592,14 @@ pick_next_task(struct rq *rq, struct task_struct *prev)
 	if (likely(prev->sched_class == class &&
 		   rq->nr_running == rq->cfs.h_nr_running)) {
 		p = fair_sched_class.pick_next_task(rq, prev);
-		if (likely(p && p != RETRY_TASK))
-			return p;
+		if (unlikely(p == RETRY_TASK))
+			goto again;
+
+		/* assumes fair_sched_class->next == idle_sched_class */
+		if (unlikely(!p))
+			p = idle_sched_class.pick_next_task(rq, prev);
+
+		return p;
 	}
 
 again: