[PATCH] sched/fair: Skip wake_affine() for core siblings

From: Kirill Tkhai
Date: Fri Sep 25 2015 - 13:54:20 EST


The actual target does not matter when the prev and curr
CPUs share a CPU cache. select_idle_sibling() searches in
top-down order; the top level is the same for both CPUs,
so the result is the same. We can therefore skip the
wake_affine() calculations and save a few CPU cycles and
cache misses.
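
The idea can be sketched in user space as follows. This is only
a toy model, not kernel code: struct toy_domain,
TOY_SD_SHARE_CACHE, toy_load_balanced() and toy_wake_affine()
are hypothetical names, and the load comparison is a placeholder
that always answers "not balanced" and counts its calls.

```c
#include <stdbool.h>

/* Hypothetical stand-in for SD_SHARE_PKG_RESOURCES; the bit value is arbitrary. */
#define TOY_SD_SHARE_CACHE 0x1u

/* Toy replacement for struct sched_domain: only the flags matter here. */
struct toy_domain {
	unsigned int flags;
};

/* Counts how often the costly path runs, so the saving is observable. */
static int toy_load_checks;

/*
 * Placeholder for the load comparison done by wake_affine().
 * It always reports "not balanced" and only records that it ran.
 */
static bool toy_load_balanced(int this_cpu, int prev_cpu)
{
	(void)this_cpu;
	(void)prev_cpu;
	toy_load_checks++;
	return false;
}

/* Returns 1 when the wakeup may be treated as affine, 0 otherwise. */
static int toy_wake_affine(const struct toy_domain *sd, int this_cpu, int prev_cpu)
{
	/* Both CPUs share a cache: skip the load comparison entirely. */
	if (sd->flags & TOY_SD_SHARE_CACHE)
		return 1;

	return toy_load_balanced(this_cpu, prev_cpu) ? 1 : 0;
}
```

With the flag set, toy_wake_affine() returns 1 without ever
calling toy_load_balanced(); without it, the comparison runs as
before.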

tbench on a Xeon with 2 physical CPUs (x 6 cores x 2 HT), run inside a cgroup:

threads | Before | After
-------------------------------------------
1 | 203.943 MB/sec | 211.524 MB/sec
2 | 407.211 MB/sec | 411.701 MB/sec
3 | 591.089 MB/sec | 608.404 MB/sec
4 | 743.768 MB/sec | 790.026 MB/sec (+ 6.2%)
5 | 914.237 MB/sec | 972.882 MB/sec (+ 6.4%)
6 | 1053.91 MB/sec | 1092.81 MB/sec
7 | 1208.24 MB/sec | 1281.1 MB/sec (+ 6.0%)
8 | 1357.53 MB/sec | 1385.79 MB/sec
9 | 1474.11 MB/sec | 1496.76 MB/sec
10 | 1586.89 MB/sec | 1616.76 MB/sec
11 | 1720.17 MB/sec | 1732.7 MB/sec
12 | 1835.4 MB/sec | 1868.77 MB/sec
13 | 1964.76 MB/sec | 2003.68 MB/sec
14 | 2117.01 MB/sec | 2128.16 MB/sec
15 | 2220.97 MB/sec | 2254.8 MB/sec
16 | 2326.52 MB/sec | 2378.38 MB/sec
17 | 2458.79 MB/sec | 2484.15 MB/sec
18 | 2473.59 MB/sec | 2591.01 MB/sec (+ 4.7%)
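
The percentages in the table are consistent with the relative
throughput gain, (after - before) / before. A minimal sketch of
that arithmetic, where relative_gain_percent() is a hypothetical
helper and not part of the patch:

```c
/* Relative throughput gain in percent; before must be non-zero. */
static double relative_gain_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

For the 4-thread row, relative_gain_percent(743.768, 790.026)
comes out at roughly 6.2.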

Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxx>
---
kernel/sched/fair.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4df37a4..b378c34 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4666,6 +4666,9 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 	unsigned long weight;
 	int balanced;

+	if (sd->flags & SD_SHARE_PKG_RESOURCES)
+		return 1;
+
 	idx = sd->wake_idx;
 	this_cpu = smp_processor_id();
 	prev_cpu = task_cpu(p);
