[patch 5/7] sched: ratelimit select_idle_sibling() for sync wakeups

From: Mike Galbraith
Date: Tue Nov 22 2011 - 09:29:31 EST



select_idle_sibling() injures synchronous loads horribly
on processors with L3 (a la Westmere) when called at high
frequency. Cut it off at 40 kHz (25 usec event rate) when
waking sync, in lieu of an inter-domain cache penalty.

Signed-off-by: Mike Galbraith <efault@xxxxxx>

---
kernel/sched/fair.c | 20 ++++++++++++++++++--
kernel/sched/features.h | 6 ++++++
2 files changed, 24 insertions(+), 2 deletions(-)
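
For readers who want to check the cutoff arithmetic outside the kernel, here
is a minimal userspace sketch of the test this patch adds. It is not the
kernel code: idle_sibling_limit_sketch() and its parameters avg_event_ns and
feature_enabled are hypothetical stand-ins for cpu_rq(target)->avg_event and
sched_feat(SIBLING_LIMIT_SYNC), and it assumes avg_event is an average
inter-event interval in nanoseconds, which is how the comparison treats it.

```c
/* Same arithmetic as the patch: 1e9 ns per second / 40000 events per second. */
#define NSEC_PER_SEC 1000000000UL
#define SIBLING_SYNC_CUTOFF_NS (NSEC_PER_SEC / 40000UL)

/*
 * Userspace stand-in for idle_sibling_limit().
 * avg_event_ns   : assumed average interval between events, in ns
 *                  (replaces cpu_rq(target)->avg_event)
 * sync           : nonzero for a synchronous wakeup
 * feature_enabled: replaces sched_feat(SIBLING_LIMIT_SYNC)
 *
 * Returns 1 when the select_idle_sibling() call should be skipped.
 */
static int idle_sibling_limit_sketch(unsigned long avg_event_ns,
				     int sync, int feature_enabled)
{
	/* Only sync wakeups are limited, and only while the feature is on. */
	if (!sync || !feature_enabled)
		return 0;
	/* Events arriving faster than one per 25 usec trip the limit. */
	return avg_event_ns < SIBLING_SYNC_CUTOFF_NS;
}
```

With these definitions the cutoff works out to 25000 ns, so a sync wakeup on
a runqueue averaging 10000 ns between events is limited, while one averaging
25000 ns or more is not.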

Index: linux-3.0-tip/kernel/sched/fair.c
===================================================================
--- linux-3.0-tip.orig/kernel/sched/fair.c
+++ linux-3.0-tip/kernel/sched/fair.c
@@ -2643,6 +2643,21 @@ find_idlest_cpu(struct sched_group *grou
}

/*
+ * select_idle_sibling() injures synchronous loads horribly
+ * on processors with L3 (a la Westmere) when called at high
+ * frequency. Cut it off at 40 kHz (25 usec event rate) when
+ * waking sync, in lieu of an inter-domain cache penalty.
+ */
+#define SIBLING_SYNC_CUTOFF_NS (NSEC_PER_SEC/40000UL)
+
+static int idle_sibling_limit(int target, int sync)
+{
+	if (!sync || !sched_feat(SIBLING_LIMIT_SYNC))
+		return 0;
+	return cpu_rq(target)->avg_event < SIBLING_SYNC_CUTOFF_NS;
+}
+
+/*
* Try and locate an idle CPU in the sched_domain.
*/
static int select_idle_sibling(struct task_struct *p, int target)
@@ -2790,9 +2805,10 @@ select_task_rq_fair(struct task_struct *

 	if (affine_sd) {
 		if (cpu == prev_cpu || wake_affine(affine_sd, p, sync))
-			prev_cpu = cpu;
+			new_cpu = cpu;

-		new_cpu = select_idle_sibling(p, prev_cpu);
+		if (!idle_sibling_limit(new_cpu, sync))
+			new_cpu = select_idle_sibling(p, new_cpu);
 		goto unlock;
 	}

Index: linux-3.0-tip/kernel/sched/features.h
===================================================================
--- linux-3.0-tip.orig/kernel/sched/features.h
+++ linux-3.0-tip/kernel/sched/features.h
@@ -68,3 +68,9 @@ SCHED_FEAT(TTWU_QUEUE, 1)

SCHED_FEAT(FORCE_SD_OVERLAP, 0)
SCHED_FEAT(RT_RUNTIME_SHARE, 1)
+
+/*
+ * Restrict the frequency at which select_idle_sibling() may be called
+ * for synchronous wakeups.
+ */
+SCHED_FEAT(SIBLING_LIMIT_SYNC, 1)

